PrestoCon Day 2022 has ended
Virtual Event | Thursday, July 21, 2022

Thursday, July 21 • 11:00am - 11:10am
Query Execution Optimization for Broadcast Join using Replicated-Reads Strategy - George Wang, Ahana

Today, Presto supports broadcast joins by having one worker fetch data from a small data source to build a hash table and then send the entire data set over the network to all other workers, where the hash table is probed with rows from the large data source. This can be optimized with a new query execution strategy in which all workers pull the source data from small tables directly, an approach known as replicated reads from dimension tables. The feature comes with a nice caching property: because all N worker nodes now participate in scanning the data from remote sources, the table scan operation for dimension tables is cacheable on every worker node. In addition, resource utilization improves because the Presto scheduler can reduce the number of plan fragments to execute, since the same workers run tasks in parallel within a single stage, which reduces data shuffles.
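
The difference between the two strategies can be illustrated with a toy model. The sketch below is not Presto code: the function names (`broadcast_join`, `replicated_read_join`), the in-memory "workers", and the row counters are all assumptions made for illustration. It only shows that both strategies produce the same join result while trading worker-to-worker shuffling for additional scans of the small table.

```python
from collections import defaultdict


def build_hash_table(rows, key):
    """Group dimension rows by join key (the build side of a hash join)."""
    table = defaultdict(list)
    for row in rows:
        table[row[key]].append(row)
    return table


def probe(fact_rows, table, key):
    """Look up each fact row in the hash table (the probe side)."""
    return [{**f, **d} for f in fact_rows for d in table.get(f[key], [])]


def broadcast_join(scan_dim, fact_splits, key):
    """Hypothetical model: one worker scans the small table, then ships
    every row to each of the other workers before they probe."""
    dim_rows = list(scan_dim())  # a single scan of the remote source
    rows_shuffled = len(dim_rows) * (len(fact_splits) - 1)
    out = []
    for split in fact_splits:  # one split per simulated worker
        out.extend(probe(split, build_hash_table(dim_rows, key), key))
    return out, {"remote_scans": 1, "rows_shuffled": rows_shuffled}


def replicated_read_join(scan_dim, fact_splits, key):
    """Hypothetical model: every worker scans the small table itself,
    so no dimension rows move between workers."""
    out, scans = [], 0
    for split in fact_splits:
        table = build_hash_table(scan_dim(), key)  # per-worker scan
        scans += 1
        out.extend(probe(split, table, key))
    return out, {"remote_scans": scans, "rows_shuffled": 0}
```

In this model, the per-worker scan in `replicated_read_join` is the step that a worker-local cache could serve on repeated queries, which corresponds to the caching property described in the abstract.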

Speakers

George Wang

Principal Software Engineer, Ahana Cloud
George Wang is the principal software engineer at Ahana Cloud. His primary focus is query performance optimization. Prior to that, George worked at Alibaba Cloud, where he built numerous query execution optimizations to support a high-QPS, low-latency compute engine for AnalyticDB...



Thursday July 21, 2022 11:00am - 11:10am PDT
Virtual