Tudip
28 March 2020
As technology is evolving speedily day by day everyone wants to become proficient in that new stuff. If you know DB and their relationship with the help of schema you’re superman and then it comes in the field which is BigQuery (BQ).
When we have a large amount of data from various sources BigQuery is the best option, as it gives you liberty to focus on your analytics and comprehension skill also it is serverless and highly available, before going ahead for any BigQuery operation make sure you have an infrastructure in place that can store high magnitude of data.
Google Cloud Platform provides a suite of cloud services such as data storage, data analysis and machine learning so that you can use machine learning in your data using BQ ML. BigQuery is a REST Based web service which allows users to run complex analytical SQL based queries under large sets of data.
Let’s find some Bigquery super points below and will see important one here
The power of data
Take an example we have a data of all the users who have submitted survey for one projects, what we will do is that we are going to fetch data of all the user by using below query, now the thing is that it will take hardly 3.2 second to show the expected results as per your query.
What will that query do? It takes data from the Server and gives you the feedback submitted by the students so that you can check what the rating for the trainer is.
It took hardly 3.2 second for 13.9 MB of data for processing. If you have data in PB still it will not take more than a minute for execution.
The power of Transmit
You can use a Tab while writing any query as it will autocomplete your query. Also, while looking at the schema, you can click on the fields, and that will auto populate your query.
The power of supervision
Let’s execute any complex query and see the results. Omg.! It took more time? why? What happened with the system? Well don’t worry for this as we have super X-ray vision so we can monitor what BigQuery did in the background. Let’s look at the query history and then the execution details tab. It will show the process executed in the background.
Well if it is a little hard to read, we have another alternative. For example Legacy BQ web UI which has more compact results.
Here you can see the slowest operation were computing while reading all the row in the table
The power of Materialization
It’s really good to have these tables in BQ, but how did they load? Well they come periodically in new files into the Cloud storage, and then I read them raw into BQ. also, we can read them in JSON row in BQ.
Time Travel
If we delete the data from the tables and after some days you are missing that data again to recover them again. How will you recover them? Instead of writing as SELECT * From the table, we will use System Functions with function so that we can get what we are missing for.
The Power of super speed
How fast is Bigquery? Is it really?
Well the answer is Google internal implementation of the HyperLogLog algorithm. It lets BQ count uniques a lot faster than other DBs can do, and has some more great features which make BQ perform well.
Invulnerability
What makes the database worse is the error like what we get after division by zero, well we can avoid this by using the BQ expression as SAFE prefix.
If you begin a function with SAFE in front, it will return NULL instead of error.
How data gets migrated in BQ let follow the path for help so that we don’t divert the way.