What method should be used to query datasets stored in different formats and services in near real-time?

The best method for querying datasets stored in different formats and services in near real-time is to use Amazon Kinesis Data Analytics with Apache Presto. Amazon Kinesis Data Analytics is designed for real-time processing of streaming data, so users can analyze data as it arrives. Paired with Apache Presto, it supports querying across diverse data sources and formats, such as data lakes, databases, and streaming data.

Presto is a distributed SQL query engine built for interactive analytical queries on large datasets. Through its connectors, a single query can read from several data sources without first moving the data into one store. This makes it well suited to running complex queries across data held in different locations and formats, which is what near real-time analytics over mixed sources requires.
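
As a rough illustration of querying across sources, the sketch below builds one SQL statement that joins tables from two different Presto catalogs, using Presto's catalog.schema.table naming. The catalog names (hive, mysql), the schemas, the tables, and the columns are all hypothetical, and the code only assembles the query text; running it would require a configured Presto cluster and a client, neither of which is shown here.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a Presto-style fully qualified table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_cross_source_query(orders: str, customers: str) -> str:
    """Build one SQL statement that joins tables living in two different catalogs."""
    return (
        "SELECT c.region, count(*) AS order_count\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region"
    )


# Hypothetical catalogs: 'hive' for files in a data lake,
# 'mysql' for an operational database. Real names depend on cluster configuration.
ORDERS = qualified("hive", "sales", "orders")
CUSTOMERS = qualified("mysql", "crm", "customers")
QUERY = build_cross_source_query(ORDERS, CUSTOMERS)

if __name__ == "__main__":
    print(QUERY)
```

The point of the sketch is that both sources appear in the same FROM/JOIN clause, so the engine, not the analyst, handles reading from each one.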

AWS Glue, used for data transformation and aggregation, is geared toward data preparation and ETL (Extract, Transform, Load) rather than immediate querying, so it does not meet the near real-time requirement. AWS Database Migration Service (AWS DMS) focuses on migrating data rather than querying it, which makes it unsuitable as well. Amazon ECM is not an AWS service, so performing queries there would not serve the purpose of querying datasets in real time.
