[2022 AWS re:Invent] Unlock the value of your data with AWS analytics

Session type

Leadership

Session title

Unlock the value of your data with AWS analytics

Speakers

Scott Donaldson, CTO, FINRA CAT
Saurabh Mehta, Sr Engineering Manager, Airbnb
G2 Krishnamoorthy, VP, Data Lakes & Analytics, Amazon

Summarized by

박위철(Wicheol Park)

Key takeaways

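Data fuels digital innovation and drives effective business decisions
To survive in a constantly changing world, organizations are turning to data to gain insights, create new experiences, and reinvent themselves so that they stay relevant now and in the future
AWS provides analytics services that help organizations get faster, deeper insights from all of their data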

Keywords

Data Silo, Governance, Data Journey, Zero ETL, Ray, FINRA, AWS Lake Formation, Amazon DataZone, Amazon Athena for Apache Spark, Amazon EMR

Details

<Current state of value creation through data analytics>

90% of the data that exists today was generated in the last two years
However, only 32% of the companies generating data realize value from it

<AWS takes four approaches to make it easier to get value from data>

SCALABLE : Performance at scale
UNIFIED : Connect all your data
COMPREHENSIVE : Tools for all your data
GOVERNED : End-to-End governance

<Why putting data to use matters>

Data fuels innovation in the digital era
The organizations that succeed will be the ones that use their data

<Directions for putting data to use>

Breaking down silos
Build a data-driven organization
- Data is an organizational asset
- Data is accessible and secure
- Data is put to work

<<<Make data work for you>>>

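Systems : Adopt a modern architecture that supports end-to-end use of data
Data : Break down data silos, including third-party data, for analytics and connect the data to machine learning systems
People : Provide a comprehensive governance strategy for complying with industry and geographic regulations
Related customer examples: JPMorgan, ENGIE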

<<Data Journey>>

<<ETL is hard>>

<AWS ETL tool : Glue> – Simple, scalable, and serverless

Integrate data faster
Automate at scale
No servers to manage
Built on Spark and Python

<AWS Glue features by role>

ETL Developer : Glue Studio
Business Analyst : Glue DataBrew
Data Engineer : Glue Notebooks

(New Service 1 : AWS Glue for Ray) – in Preview
; Easy-to-scale Ray open-source Python framework, made serverless

Scale existing Python code in AWS Glue for Ray
No Infrastructure management or tuning
Cost-Effective to scale with pay-as-you-go billing

(New Service 2 : Amazon Redshift streaming ingestion) – GA
; Ingest streaming data into your data warehouse for real-time analytics
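As a rough illustration of what streaming ingestion looks like in practice, the sketch below renders the two SQL statements typically involved: an external schema that points at Kinesis Data Streams, and an auto-refreshing materialized view over one stream. This is not from the session; the syntax is recalled from the Redshift documentation and should be checked there before use. The schema, stream, and view names and the IAM role ARN are hypothetical placeholders, and the helper only builds strings; it does not connect to AWS.

```python
from typing import List


def streaming_ingestion_sql(schema: str, stream: str, view: str,
                            iam_role_arn: str) -> List[str]:
    """Build the SQL for a Redshift streaming-ingestion setup (sketch only).

    All argument values are placeholders chosen by the caller; nothing here
    is validated against a real cluster.
    """
    # Step 1: map a Kinesis Data Streams source into Redshift as an
    # external schema, using an IAM role that can read the stream.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        "FROM KINESIS\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )
    # Step 2: a materialized view that Redshift refreshes automatically,
    # parsing each record's payload as JSON.
    create_view = (
        f"CREATE MATERIALIZED VIEW {view} AUTO REFRESH YES AS\n"
        "SELECT approximate_arrival_timestamp,\n"
        "       JSON_PARSE(kinesis_data) AS payload\n"
        f'FROM {schema}."{stream}";'
    )
    return [create_schema, create_view]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in streaming_ingestion_sql(
        schema="kds",
        stream="clickstream-example",
        view="clickstream_mv",
        iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
    ):
        print(statement)
        print()
```

The point of the pattern is that no separate staging pipeline sits between the stream and the warehouse: queries run against the materialized view, which Redshift keeps up to date.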

(New Service 3 : Amazon Aurora zero-ETL to Amazon Redshift) – in Preview
; Boost analytical query performance

No ETL pipelines
Real-time Analytics and machine learning on transactional data
Consolidated Insights from multiple Aurora databases

<AWS services for ingesting application data>

Data also needs to be collected from applications, including SaaS applications
Amazon AppFlow
- Automate data flows by securely integrating third-party applications & AWS services, without code
- Easy to use, cost-effective, and reduces operational overhead; also strong on security and scalability
- Supports about 50 SaaS applications, including SAP, Salesforce, ServiceNow, and Facebook
AWS Data Exchange
; AWS Data Exchange is fully integrated with Amazon Redshift data sharing
- Find and subscribe to third-party data
- No data copies or movement
- Automated Licensing and contracting
- Secure and governed collaboration with your partners

<<Services that support breaking down data silos and putting data to work>>

Amazon Redshift: Your data warehouse foundation
(New Service 4 – GA) Amazon Redshift integration for Apache Spark
Amazon S3: Your storage foundation for data lakes
Broad support for transactional data lakes
Governance is Key – Unified governance for analytics and machine learning
Key related services
- AWS Lake Formation
- Amazon DataZone (coming soon)
- Amazon Athena for Apache Spark (New – GA)
- Amazon EMR, Amazon OpenSearch Service
- Amazon OpenSearch Serverless (New – Preview)
- Amazon MSK and Kinesis Data Analytics

<Service innovations for streaming data>

Amazon MSK tiered storage
Kinesis data streams data viewer

<Real-time data analytics use cases>

Customer 360, Personalized customer experiences, Real-time notifications,
More accurate ML models, Anomaly and fraud detection, Predictive maintenance,
IoT and asset management, Security and threat management

<ML service innovations>

<AWS programs and certifications for the Data Journey>

Bespin’s Comment

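Amazon is delivering new and updated services that support the end-to-end Data Journey, aiming to meet the varied data requirements of each customer department and persona
Existing services have improved significantly not only in performance and features but also in ease of use
From a data-usage standpoint, cloud services whose adoption was postponed because of missing features or constraints in the services currently in use are worth re-evaluating
Bespin Global can provide PoCs and technical introductions for the newly added services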
