Online courses

This course consists of two parts. The first part describes what big data is and the opportunities and challenges we face in the age of big data. The second part describes the Huawei Kunpeng Big Data solution, including the Kunpeng server based on the Kunpeng chipset and HUAWEI CLOUD Kunpeng cloud services.

Big Data Development Trend and Kunpeng Big Data Solution

This course describes HDFS, the distributed storage system for big data, and ZooKeeper, the distributed service framework that resolves frequently encountered data management problems in distributed services.

HDFS and ZooKeeper

The Apache Hive data warehouse software helps read, write, and manage large data sets that reside in distributed storage by using SQL. Structures can be projected onto data that is already stored. A command-line tool and a JDBC driver are provided to connect users to Hive.

Hive - Distributed Data Warehouse

This course describes HBase, a non-relational distributed database from the Hadoop open-source community that can meet the requirements of large-scale, real-time data processing applications.

HBase Technical Principles

This course describes MapReduce and YARN. MapReduce is one of the best-known computing frameworks for batch and offline processing in the big data field. YARN is the component responsible for unified resource management and scheduling in a Hadoop cluster.

MapReduce and YARN Technical Principles

This course describes the basic concepts of Spark and the similarities and differences between its Resilient Distributed Dataset (RDD), Dataset, and DataFrame data structures.

Spark - An In-Memory Distributed Computing Engine

This course describes Flink's core technologies and architecture, its time and window mechanisms, and its fault tolerance mechanism.

Flink, Stream and Batch Processing in a Single Engine

Flume is an open-source, distributed, reliable, and highly available system for aggregating massive volumes of logs. It supports custom data senders for collecting data, performs simple processing on the data, and writes it to data receivers.

Flume - Massive Log Aggregation

Loader is used for efficient data import and export between the big data platform and structured data storage (such as relational databases). It is based on the open-source Sqoop 1.99.x, with enhanced functions.

Loader Data Conversion

This course describes the basic concepts, architecture, and functions of Kafka. It focuses on how Kafka ensures reliable data storage and transmission and how it handles historical data.

Kafka - Distributed Publish-Subscribe Messaging System

The in-depth development of big data open-source technologies cannot be achieved without the support of underlying platform technologies such as Hadoop. To manage access control permissions for data and resources in the cluster, the Huawei big data platform implements a highly reliable cluster security mode based on LDAP and Kerberos and provides integrated security authentication.

LDAP and Kerberos

In recent years, Elasticsearch has developed rapidly and grown beyond its original role as a search engine, adding data aggregation, analysis, and visualization features. If you need to locate content by keyword across millions of documents, Elasticsearch is a strong choice.

Elasticsearch - Distributed Search Engine

Redis is a network-based, high-performance, in-memory key-value database that is frequently used in a variety of scenarios. This course covers the architecture and application scenarios of Redis.

Redis In-Memory Database

This course covers the Huawei Big Data solution. The solution implements seamless cross-cloud synchronization of advanced service capabilities and multi-scenario collaboration, and supports Huawei Kunpeng and Ascend computing capabilities to help governments and enterprises achieve refined resource control, cross-cloud hybrid orchestration, and multi-scenario collaboration.

Huawei Big Data Solution
