AWS Launches Amazon Redshift Spectrum

New capability allows Amazon Redshift customers to run analytic queries quickly and inexpensively against exabytes of data in Amazon S3

SEATTLE -- Today, Amazon Web Services, Inc. (AWS), an Amazon.com company (NASDAQ: AMZN), announced Amazon Redshift Spectrum, a new feature that allows Amazon Redshift customers to run SQL queries against exabytes of their data in Amazon Simple Storage Service (Amazon S3). With Redshift Spectrum, customers can extend the analytic power of Amazon Redshift beyond data stored on local disks in their data warehouse to query vast amounts of unstructured data in their Amazon S3 “data lake” – without having to load or transform any data. Redshift Spectrum applies sophisticated query optimization, scaling processing across thousands of nodes so results are fast – even with large data sets and complex queries. To get started with Redshift Spectrum, visit https://aws.amazon.com/redshift/spectrum.

Amazon Redshift is one of AWS’s fastest-growing services because it allows customers to perform complex queries on petabytes of structured data stored on high-performance local disks and get superfast performance – all for a tenth of the cost of traditional data warehouses. However, as the cost of data storage has continued to drop, customers are increasingly storing vast amounts of data in Amazon S3 “data lakes,” including unstructured data that may never make it into a data warehouse. Now, with Redshift Spectrum, analyzing all of this data is as easy as running a standard Amazon Redshift SQL query. Redshift Spectrum directly queries data in Amazon S3, with no loading or transformation required, using the open data formats customers already use, including CSV, TSV, Parquet, Sequence, and RCFile. Since Redshift Spectrum supports the same SQL syntax as Amazon Redshift, customers can run sophisticated queries using the same Business Intelligence (BI) tools they use today. They can also run queries that span both the frequently accessed data stored locally in Amazon Redshift and their full data sets stored cost-effectively in Amazon S3. Redshift Spectrum automatically scales query compute capacity based on the data being retrieved, so queries against Amazon S3 data run fast, whether processing just a few terabytes, petabytes, or even exabytes.
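
As a rough illustration of the workflow described above, the following Python sketch builds the kind of SQL statements a customer might issue: one to register an external schema, one to define an external table over Parquet files in Amazon S3, and one query that joins that external table with a table stored locally in Amazon Redshift. It only generates SQL text and does not connect to a cluster. The schema, table, column, bucket, and IAM role names are hypothetical placeholders, and the statement syntax is an assumption based on Amazon Redshift's documented external-table DDL, so it should be checked against the current documentation before use.

```python
# Minimal sketch: generate SQL text for querying Amazon S3 data from Redshift.
# All identifiers (schema, tables, columns, bucket, role ARN) are hypothetical.


def external_schema_sql(schema, catalog_db, iam_role_arn):
    """Return DDL that registers an external schema backed by a data catalog."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def external_table_sql(schema, table, columns, s3_location):
    """Return DDL that defines an external table over Parquet files in S3."""
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n  {cols}\n)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )


def spanning_query_sql(schema, external_table, local_table):
    """Return a query joining S3-resident events with a local Redshift table."""
    return (
        "SELECT c.region, COUNT(*) AS event_count\n"
        f"FROM {schema}.{external_table} AS e\n"
        f"JOIN {local_table} AS c ON c.customer_id = e.customer_id\n"
        "GROUP BY c.region\n"
        "ORDER BY event_count DESC;"
    )


if __name__ == "__main__":
    print(external_schema_sql(
        "spectrum", "example_db",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole"))
    print(external_table_sql(
        "spectrum", "click_events",
        [("customer_id", "BIGINT"), ("event_time", "TIMESTAMP")],
        "s3://example-bucket/click-events/"))
    print(spanning_query_sql("spectrum", "click_events", "customers"))
```

The generated statements could then be run from any SQL client or BI tool already connected to the Amazon Redshift cluster, since the external table is addressed with ordinary SQL.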

“Customers such as Amgen, Boingo Wireless, Electronic Arts, Hearst, Lyft, Nasdaq, Scholastic, TripAdvisor, and Yahoo! are migrating to Amazon Redshift in droves because it leverages the scale of AWS to analyze petabytes of data with ten times the performance at one-tenth the cost of old guard data warehouses. Many of these customers have asked us to extend the speed and flexibility of Amazon Redshift beyond the data warehouse to analyze all of the data they have in Amazon S3,” said Raju Gulabani, Vice President, Databases, Analytics, and AI, AWS. “Redshift Spectrum does just this, offering the best of both worlds by making it incredibly easy to query exabytes of data in Amazon S3 directly from Amazon Redshift. We’re excited to now make exabyte-scale analytics fast, simple and accessible to companies of all sizes.”

Tokyo-based NTT DOCOMO is the largest mobile service provider in Japan, serving more than 68 million customers. “Our data analysis platform collects tens of terabytes of log data each day from a variety of internal and external sources to help us improve our logistics and marketing operations. Migrating to Amazon Redshift two years ago allowed us to scale to over ten petabytes of uncompressed data with a ten times performance improvement over our prior on-premises system,” said Mick Etoh, Senior Vice President and General Manager of Innovation Management Department, NTT DOCOMO. “Redshift Spectrum will let us expand the universe of the data we analyze to 100s of petabytes over time. This is truly a game changer, and we can think of no other system in the world that can get us there.”

Time Inc. is a leading content company that engages over 150 million consumers every month through its portfolio of premium brands across platforms. “As a media company, we receive a large quantity of data from a number of ad serving providers. This data comes in a variety of formats and needs to be integrated with our own internal systems in order for our teams to be able to analyze it,” said Vladimir Barkov, Director of Data Architecture and Engineering at Time Inc. “Redshift Spectrum enables us to directly operate on our data in its native format in Amazon S3 with no preprocessing or transformation. Our data pipeline is much simpler now, and our execution time has been lowered significantly.”

Edmunds offers detailed, constantly updated information about vehicles to 20 million monthly visitors. “Amazon Redshift’s scalability allows us to support our ever-growing data volumes, unlike our previous, on-premises data warehouse solution,” said Ajit Zadgaonkar, Edmunds’s Executive Director of Operations and Infrastructure. “With Redshift Spectrum, we no longer need to think about what data to retain for analysis and what to throw away. We can now run real SQL queries directly on many years of data stored cost-effectively in Amazon S3. Redshift Spectrum’s fast performance across massive data sets is unprecedented.”

Redfin is the next-generation real estate brokerage, combining its own full-service agents with modern technology to redefine real estate in the consumer’s favor across more than 80 metros in the US. “With millions of users and hundreds of millions of property listings, our website and internal systems generate a vast amount of data. Our data analytics platform has been built from the ground up on AWS, using Amazon S3 for storage, Amazon Kinesis for streaming, Amazon EMR for data processing and real-time applications, and Amazon Redshift for data warehousing,” said Yong Huang, Director of Big Data and Analytics at Redfin. “We love Redshift Spectrum because it allows us to directly and flexibly query our most up-to-date data coming from many different complex pipelines in many different file formats. Our data science team using Amazon EMR can now collaborate with our marketing and product teams using Redshift Spectrum to analyze the same Amazon S3 data sets.”

Yelp connects people with great local businesses and provides them with in-depth reviews across 32 countries. “Yelp generates billions of analytics events every day across our 24 million average monthly mobile app unique users, 65 million average monthly mobile web unique visitors, and 73 million average monthly desktop unique visitors as of December 31, 2016. Our shift to mobile has stressed our analytics infrastructure, as our mobile app users have ten times more engagement than our website users,” said Justin Cunningham, Technical Lead in the Software Engineering team at Yelp. “Redshift Spectrum unlocks analytic access to our Amazon S3 data, reducing the time-to-insight across large data sets to seconds. It will enable many more use cases than we serve today – multiple teams can now query the same Amazon S3 data sets using both Amazon Redshift and Amazon EMR.”

Recruit Technologies operates some of the most popular media brands and advertising properties in Japan. “Our web and mobile properties that are provided by Recruit’s subsidiaries generate billions of events per day which we analyze to improve our business, including marketing, business planning, and product enhancements. We migrated to Amazon Redshift in 2015 to keep up with the explosion of data in our business,” said Satoshi Honmura, Group Manager of Data Management at Recruit Technologies. “Redshift Spectrum will help us scale yet further while also lowering our costs. Now, our data scientists can run sophisticated queries against many years of historical data in Amazon S3, paying just for the queries they run, while our hundreds of business users can continue to use Redshift local storage to deliver blazingly fast performance against more recent data.”

Customers can get started with Redshift Spectrum from the AWS Management Console. Amazon Redshift Spectrum is available in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions and will expand to additional Regions in the coming months.

About Amazon Web Services

For 11 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud platform. AWS offers over 90 fully featured services for compute, storage, networking, database, analytics, application services, deployment, management, developer, mobile, Internet of Things (IoT), Artificial Intelligence (AI), security, hybrid, and enterprise applications, from 42 Availability Zones (AZs) across 16 geographic regions in the U.S., Australia, Brazil, Canada, China, Germany, India, Ireland, Japan, Korea, Singapore, and the UK. AWS services are trusted by millions of active customers around the world – including the fastest growing startups, largest enterprises, and leading government agencies – to power their infrastructure, make them more agile, and lower costs. To learn more about AWS, visit https://aws.amazon.com.

About Amazon

Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Fire tablets, Fire TV, Amazon Echo, and Alexa are some of the products and services pioneered by Amazon. For more information, visit www.amazon.com/about and follow @AmazonNews.

Contacts

Amazon.com, Inc.
Media Hotline
Amazon-pr@amazon.com
www.amazon.com/pr

