The number of data centers providing data services around the world keeps growing, and the large volume of metadata produced by systems, services, and users creates many problems for maintenance and operations management. If each system in each data center builds and manages its own metadata separately, the availability, scalability, and applicability of data services suffer, especially for authorized third parties. It is therefore useful to build a unified metadatabase that supports secure access control and system-reflection management across distributed data centers.
Minh Chau Nguyen and Hee Sun Won explore integrated metadata management for a geographically distributed Hadoop ecosystem. They describe an implementation that lets multiple users securely access the metadata and, through a flexible schema management mechanism, reflects changes to specific systems at runtime across geographically distributed data centers. Along the way, Minh and Hee Sun outline the main requirements and challenges in building this platform, explain details of the design, and compare it with existing approaches.
The main features of this unified metadatabase are as follows:
Minh Chau Nguyen is a researcher in the smart data platform research department at the Electronics and Telecommunications Research Institute (ETRI). His research interests include big data management, software architecture, and distributed systems.
Heesun Won is a principal researcher at the Electronics and Telecommunications Research Institute (ETRI), where she has been developing SODAS (Smart Open Data as a System), an open data reference model and data distribution system with a semantic data map. Her research interests include software architecture for big data processing in cloud environments.
©2016, O'Reilly Media, Inc. • All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with, and does not endorse or review, the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.