Views: 10 | Date: Feb 26, 2021 | By: apache | Type: software
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby. Apache Arrow Powering In-Memory Analytics Apache Arrow is a...
Topics: GitHub, code, software, git
Views: 7 | Date: Nov 14, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 41 | Date: Oct 5, 2017 | By: apache | Type: software
Mirror of Apache Flink Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flink at http://flink.apache.org/ Features A streaming-first runtime that supports both batch processing and data streaming programs Elegant and fluent APIs in Java and Scala A runtime that supports very high throughput and low event latency at the same time Support for event time and out-of-order processing in the DataStream...
Topics: GitHub, code, software, git
Views: 6 | Date: Sep 10, 2021 | By: apache | Type: software
Apache Doris (Incubating) Apache Doris (incubating) Doris is an MPP-based interactive SQL data warehouse for reporting and analysis. Its original name was Palo, developed at Baidu. After being donated to the Apache Software Foundation, it was renamed Doris. Official website: https://doris.apache.org/ 1. License Apache License, Version 2.0 2. Technology Doris mainly integrates the technology of Google Mesa and Apache Impala, and it is based on a column-oriented storage engine and can communicate by MySQL...
Topics: GitHub, code, software, git
Views: 31 | Date: Jun 19, 2017 | By: apache | Type: software
Mirror of Apache Beam Apache Beam Apache Beam is a unified model for defining both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and Runners for executing them on distributed processing backends, including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. Status Overview Beam provides a general approach to expressing embarrassingly parallel data processing pipelines and supports three...
Topics: GitHub, code, software, git
Views: 19 | Date: Sep 19, 2019 | By: apache | Type: software
Apache DolphinScheduler Easy Scheduler Easy Scheduler for Big Data Design features: A distributed and easy-to-expand visual DAG workflow scheduling system. Dedicated to solving the complex dependencies in data processing, making the scheduling system work out of the box for data processing. Its main objectives are as follows: Associate the tasks according to their dependencies in a DAG graph, which can visualize the running state of tasks in real time. Support for many task types: Shell,...
Topics: GitHub, code, software, git
Views: 48 | Date: Aug 23, 2019 | By: apache | Type: software
Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application Superset Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application [this project used to be named Caravel, and Panoramix in the past] Screenshots & Gifs View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of visualizations Apache Superset Apache Superset is a...
Topics: GitHub, code, software, git
Views: 9 | Date: May 11, 2021 | By: apache | Type: software
Apache Pulsar - distributed pub-sub messaging system Pulsar is a distributed pub-sub messaging platform with a very flexible messaging model and an intuitive client API. Learn more about Pulsar at https://pulsar.apache.org Main features Horizontally scalable (Millions of independent topics and millions of messages published per second) Strong ordering and consistency guarantees Low latency durable storage Topic and queue semantics Load balancer Designed for being deployed as a hosted service:...
Topics: GitHub, code, software, git
Views: 20 | Date: Jan 25, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 31 | Date: Jun 29, 2020 | By: apache | Type: software
Distributed scheduled job framework Elastic-Job - distributed scheduled job solution Homepage Chinese homepage Elastic-Job-Lite Console Overview Elastic-Job is a distributed scheduled job solution. Elastic-Job is composed of 2 independent subprojects: Elastic-Job-Lite and Elastic-Job-Cloud. Elastic-Job-Lite is a centre-less solution that uses a lightweight jar to coordinate distributed jobs. Elastic-Job-Lite and Elastic-Job-Cloud provide a unified API. Developers only need to code once, then decide to...
Topics: GitHub, code, software, git
Views: 21 | Date: Apr 23, 2020 | By: apache | Type: software
Distributed database middleware ShardingSphere - Distributed Database Middleware Ecosphere Official website: https://shardingsphere.apache.org/ Document Overview ShardingSphere is an open-source ecosystem consisting of a set of distributed database middleware solutions, including 2 independent products, Sharding-JDBC & Sharding-Proxy & Sharding-Sidecar (todo). They all provide functions of data sharding, distributed transaction and database orchestration, applicable in a variety of...
Topics: GitHub, code, software, git
Views: 6 | Date: Jan 2, 2022 | By: apache | Type: software
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time). SeaTunnel SeaTunnel was formerly named Waterdrop, and was renamed SeaTunnel on October 12, 2021. SeaTunnel is a very easy-to-use ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data. It can synchronize tens of billions of data stably and efficiently every day, and has been used...
Topics: GitHub, code, software, git
Views: 22 | Date: Apr 30, 2020 | By: apache | Type: software
Apache Flink Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flink at https://flink.apache.org/ Features A streaming-first runtime that supports both batch processing and data streaming programs Elegant and fluent APIs in Java and Scala A runtime that supports very high throughput and low event latency at the same time Support for event time and out-of-order processing in the DataStream API, based...
Topics: GitHub, code, software, git
Views: 25 | Date: Jan 26, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 20 | Date: Apr 11, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 14 | Date: Nov 22, 2019 | By: apache | Type: software
Apache ServiceComb Pack is an eventual data consistency solution for micro-service applications. ServiceComb Pack currently provides TCC and Saga distributed transaction co-ordination solutions by using Alpha as a transaction coordinator and Omega as a transaction agent. Pack | Chinese Apache ServiceComb Pack is an eventual data consistency solution for micro-service applications. Features High availability. The coordinator is stateless and thus can have multiple instances. High...
Topics: GitHub, code, software, git
Views: 21 | Date: Mar 3, 2022 | By: apache | Type: software
The Cloud-Native API Gateway Apache APISIX Apache APISIX is a dynamic, real-time, high-performance API gateway. APISIX provides rich traffic management features such as load balancing, dynamic upstream, canary release, circuit breaking, authentication, observability, and more. You can use Apache APISIX to handle traditional north-south traffic, as well as east-west traffic between services. It can also be used as a k8s ingress controller. The technical architecture of Apache APISIX: Community...
Topics: GitHub, code, software, git
Views: 30 | Date: Apr 28, 2020 | By: apache | Type: software
Apache Flink Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flink at https://flink.apache.org/ Features A streaming-first runtime that supports both batch processing and data streaming programs Elegant and fluent APIs in Java and Scala A runtime that supports very high throughput and low event latency at the same time Support for event time and out-of-order processing in the DataStream API, based...
Topics: GitHub, code, software, git
Views: 64 | Date: Jan 27, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 14 | Date: Apr 12, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 7 | Date: Jan 12, 2022 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Why Superset? Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset provides:...
Topics: GitHub, code, software, git
Views: 4 | Date: Jun 22, 2022 | By: apache | Type: software
Apache HBase To restore the repository download the bundle wget https://archive.org/download/github.com-apache-hbase_-_2022-06-22_12-01-56/apache-hbase_-_2022-06-22_12-01-56.bundle and run: git clone apache-hbase_-_2022-06-22_12-01-56.bundle Source: https://github.com/apache/hbase Uploader: apache Upload date: 2022-06-22
Topics: GitHub, code, software, git
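The restore steps in the HBase entry above can be sketched as a small helper. This is a minimal sketch, assuming the URL pattern seen in that entry (`https://archive.org/download/github.com-<name>/<name>.bundle`) also holds for other bundles; the function name `restore_commands` is hypothetical, and the function only builds the command strings without running them.

```python
from typing import List


def restore_commands(bundle_stem: str) -> List[str]:
    """Build the two restore commands for an archived repository bundle.

    Assumption: the item identifier is "github.com-" + bundle_stem and the
    bundle file is bundle_stem + ".bundle", as in the HBase entry above.
    """
    bundle = f"{bundle_stem}.bundle"
    url = f"https://archive.org/download/github.com-{bundle_stem}/{bundle}"
    # First download the bundle, then clone from the local bundle file.
    return [f"wget {url}", f"git clone {bundle}"]


for command in restore_commands("apache-hbase_-_2022-06-22_12-01-56"):
    print(command)
```

Running the printed commands in order should download the bundle and then clone the repository from it, provided the assumed URL pattern is valid for the chosen item.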
Views: 25 | Date: May 27, 2017 | By: apache | Type: software
Download the bundle apache-incubator-singa_-_2017-05-27_07-00-07.bundle and run: git clone apache-incubator-singa_-_2017-05-27_07-00-07.bundle -b master Mirror of Apache Singa (Incubating) Apache SINGA Distributed deep learning system http://singa.apache.org Quick Start Installation Examples Issues JIRA tickets Mailing Lists Development Mailing List ( Archive ) Commits Mailing List ( Archive ) Source: https://github.com/apache/incubator-singa Uploader: apache Upload date: 2017-05-27
Topics: GitHub, code, software, git
Views: 25 | Date: Jan 24, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 2 | Date: Sep 25, 2021 | By: apache | Type: software
Apache ECharts is a powerful, interactive charting and data visualization library for the browser Apache ECharts Apache ECharts is a free, powerful charting and visualization library offering an easy way of adding intuitive, interactive, and highly customizable charts to your commercial products. It is written in pure JavaScript and based on zrender, which is a whole new lightweight canvas library. Chinese official website | ENGLISH HOMEPAGE Get Apache ECharts You may choose one of the following methods:...
Topics: GitHub, code, software, git
Views: 22 | Date: Apr 9, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 33 | Date: Oct 16, 2019 | By: apache | Type: software
Sharding-Sphere examples ShardingSphere-example Example for 1.x please see tags in https://github.com/apache/incubator-shardingsphere/tree/${tag}/sharding-jdbc-example Example for 2.x or 3.x or 4.x please see tags in https://github.com/apache/incubator-shardingsphere-example/tree/${tag} Need attention Please do not use dev branch to run your example, example of dev branch is not released yet. The manual schema initial script is in...
Topics: GitHub, code, software, git
Views: 50 | Date: Jun 19, 2017 | By: apache | Type: software
Mirror of Apache Hive Apache Hive (TM) The Apache Hive (TM) data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Built on top of Apache Hadoop (TM), it provides: Tools to enable easy access to data via SQL, thus enabling data warehousing tasks such as extract/transform/load (ETL), reporting, and data analysis A mechanism to impose structure on a variety of data formats Access to files stored either directly in Apache HDFS (TM)...
Topics: GitHub, code, software, git
Views: 71 | Date: Nov 11, 2020 | By: apache | Type: software
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows Apache Airflow Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows. When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative. Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified...
Topics: GitHub, code, software, git
Views: 3 | Date: Sep 2, 2021 | By: apache | Type: software
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available out of the box. Dolphin Scheduler Official Website dolphinscheduler.apache.org Design Features DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data...
Topics: GitHub, code, software, git
Views: 8 | Date: Mar 12, 2022 | By: apache | Type: software
Apache Arrow DataFusion and Ballista query engines DataFusion DataFusion is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format. DataFusion supports both an SQL and a DataFrame API for building logical query plans as well as a query optimizer and execution engine capable of parallel execution against partitioned data sources (CSV and Parquet) using threads. DataFusion also supports distributed query execution via the Ballista crate. Use Cases...
Topics: GitHub, code, software, git
Views: 4 | Date: May 15, 2021 | By: apache | Type: software
Apache Thrift Apache Thrift Introduction Thrift is a lightweight, language-independent software stack for point-to-point RPC implementation. Thrift provides clean abstractions and implementations for data transport, data serialization, and application level processing. The code generation system takes a simple definition language as input and generates code across programming languages that uses the abstracted stack to build interoperable RPC clients and servers. Thrift makes it easy for programs...
Topics: GitHub, code, software, git
Views: 8 | Date: May 19, 2021 | By: apache | Type: software
Apache Spark - A unified analytics engine for large-scale data processing Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing....
Topics: GitHub, code, software, git
Views: 13 | Date: Jan 26, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 35 | Date: Sep 25, 2019 | By: apache | Type: software
Apache Spark Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. https://spark.apache.org/ Online Documentation You can find the...
Topics: GitHub, code, software, git
Views: 34 | Date: Jun 19, 2017 | By: apache | Type: software
Mirror of Apache Accumulo Apache Accumulo is a sorted, distributed key/value store that provides robust, scalable data storage and retrieval. Apache Accumulo is based on Google's [BigTable] design and is built on top of Apache [Hadoop], [Zookeeper], and [Thrift]. It has several novel [features] such as cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process. Installation Follow these instructions to install...
Topics: GitHub, code, software, git
Views: 10 | Date: Mar 14, 2022 | By: apache | Type: software
Apache Arrow DataFusion and Ballista query engines DataFusion DataFusion is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format. DataFusion supports both an SQL and a DataFrame API for building logical query plans as well as a query optimizer and execution engine capable of parallel execution against partitioned data sources (CSV and Parquet) using threads. DataFusion also supports distributed query execution via the Ballista crate. Use Cases...
Topics: GitHub, code, software, git
Views: 11 | Date: Jan 25, 2021 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
Views: 55 | Date: Jun 30, 2020 | By: apache | Type: software
Distributed scheduled job framework Elastic-Job - distributed scheduled job solution Homepage Chinese homepage Elastic-Job-Lite Console Overview Elastic-Job is a distributed scheduled job solution. Elastic-Job is composed of 2 independent subprojects: Elastic-Job-Lite and Elastic-Job-Cloud. Elastic-Job-Lite is a centre-less solution that uses a lightweight jar to coordinate distributed jobs. Elastic-Job-Lite and Elastic-Job-Cloud provide a unified API. Developers only need to code once, then decide to...
Topics: GitHub, code, software, git
Views: 16 | Date: Apr 10, 2021 | By: apache | Type: software
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`. Dolphin Scheduler Official Website dolphinscheduler.apache.org Design Features DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the...
Topics: GitHub, code, software, git
Views: 20 | Date: Jun 23, 2020 | By: apache | Type: software
Apache Spark - A unified analytics engine for large-scale data processing Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing....
Topics: GitHub, code, software, git
Views: 3 | Date: May 7, 2022 | By: apache | Type: software
Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data. Apache Camel Apache Camel is an Open Source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data. Introduction Camel empowers you to define routing and mediation rules in a variety of domain-specific languages (DSL, such as Java, XML, Groovy, Kotlin, and YAML). This means you get smart...
Topics: GitHub, code, software, git
Views: 3 | Date: Aug 23, 2021 | By: apache | Type: software
Apache Arrow DataFusion and Ballista query engines DataFusion DataFusion is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format. DataFusion supports both an SQL and a DataFrame API for building logical query plans as well as a query optimizer and execution engine capable of parallel execution against partitioned data sources (CSV and Parquet) using threads. DataFusion also supports distributed query execution via the Ballista crate. Use Cases...
Topics: GitHub, code, software, git
Views: 6 | Date: Jan 12, 2022 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Why Superset? Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset provides:...
Topics: GitHub, code, software, git
Views: 12 | Date: Jan 13, 2022 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Why Superset? Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset provides:...
Topics: GitHub, code, software, git
Views: 65 | Date: Sep 15, 2019 | By: apache | Type: software
Mirror of Apache NiFi Apache NiFi is an easy to use, powerful, and reliable system to process and distribute data. Table of Contents Features Requirements Getting Started Getting Help Documentation License Export Control Features Apache NiFi was made for dataflow. It supports highly configurable directed graphs of data routing, transformation, and system mediation logic. Some of its key features include: Web-based user interface Seamless experience for design, control, and monitoring...
Topics: GitHub, code, software, git
Views: 6 | Date: Feb 25, 2022 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Why Superset? Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset provides:...
Topics: GitHub, code, software, git
Views: 3 | Date: Mar 13, 2022 | By: apache | Type: software
Apache Arrow DataFusion and Ballista query engines DataFusion DataFusion is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format. DataFusion supports both an SQL and a DataFrame API for building logical query plans as well as a query optimizer and execution engine capable of parallel execution against partitioned data sources (CSV and Parquet) using threads. DataFusion also supports distributed query execution via the Ballista crate. Use Cases...
Topics: GitHub, code, software, git
Views: 17 | Date: May 10, 2021 | By: apache | Type: software
Apache Pulsar - distributed pub-sub messaging system Pulsar is a distributed pub-sub messaging platform with a very flexible messaging model and an intuitive client API. Learn more about Pulsar at https://pulsar.apache.org Main features Horizontally scalable (Millions of independent topics and millions of messages published per second) Strong ordering and consistency guarantees Low latency durable storage Topic and queue semantics Load balancer Designed for being deployed as a hosted service:...
Topics: GitHub, code, software, git
Views: 7 | Date: Dec 30, 2021 | By: apache | Type: software
Apache Maven Daemon To restore the repository download the bundle wget https://archive.org/download/github.com-apache-maven-mvnd_-_2021-12-29_18-27-47/apache-maven-mvnd_-_2021-12-29_18-27-47.bundle and run: git clone apache-maven-mvnd_-_2021-12-29_18-27-47.bundle Source: https://github.com/apache/maven-mvnd Uploader: apache Upload date: 2021-12-29
Topics: GitHub, code, software, git
Views: 33 | Date: Oct 5, 2017 | By: apache | Type: software
Mirror of Apache Lucene + Solr Apache Lucene and Solr Apache Lucene is a high-performance, full-featured text search engine library written in Java. Apache Solr is an enterprise search platform written using Apache Lucene. Major features include full-text search, index replication and sharding, and result faceting and highlighting. Online Documentation This README file only contains basic setup instructions. For more comprehensive documentation, visit: Lucene:...
Topics: GitHub, code, software, git
Views: 28 | Date: Dec 26, 2019 | By: apache | Type: software
Upserts And Incremental Processing on Big Data Apache Hudi (Incubating) Apache Hudi (Incubating) (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). https://hudi.apache.org/ Features Upsert support with fast, pluggable indexing Atomically publish data with rollback support Snapshot isolation between writer & queries Savepoints for data recovery...
Topics: GitHub, code, software, git
Views: 9 | Date: Jan 5, 2022 | By: apache | Type: software
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time). SeaTunnel SeaTunnel was formerly named Waterdrop, and was renamed SeaTunnel on October 12, 2021. SeaTunnel is a very easy-to-use ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data. It can synchronize tens of billions of data stably and efficiently every day, and has been used...
Topics: GitHub, code, software, git
Views: 11 | Date: Jan 14, 2022 | By: apache | Type: software
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Why Superset? Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset provides:...
Topics: GitHub, code, software, git
33
33
Jan 25, 2021
01/21
by
apache
software
eye 33
favorite 0
comment 0
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
27
27
Apr 22, 2020
04/20
by
apache
software
eye 27
favorite 0
comment 0
Distributed database middleware ShardingSphere - Distributed Database Middleware Ecosphere Official website: https://shardingsphere.apache.org/ Document Overview ShardingSphere is an open-source ecosystem consisting of a set of distributed database middleware solutions, including 2 independent products, Sharding-JDBC & Sharding-Proxy & Sharding-Sidecar (todo). They all provide functions of data sharding, distributed transaction and database orchestration, applicable in a variety of...
Topics: GitHub, code, software, git
29
29
Jan 23, 2021
01/21
by
apache
software
eye 29
favorite 0
comment 0
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Screenshots & Gifs Gallery View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of...
Topics: GitHub, code, software, git
22
22
May 4, 2020
05/20
by
apache
software
eye 22
favorite 0
comment 0
Apache Flink Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flink at https://flink.apache.org/ Features A streaming-first runtime that supports both batch processing and data streaming programs Elegant and fluent APIs in Java and Scala A runtime that supports very high throughput and low event latency at the same time Support for event time and out-of-order processing in the DataStream API, based...
Topics: GitHub, code, software, git
4
4.0
Feb 10, 2022
02/22
by
apache
software
eye 4
favorite 0
comment 0
Mirror of Apache RocketMQ Apache RocketMQ Apache RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability. It offers a variety of features: Messaging patterns including publish/subscribe, request/reply and streaming Financial-grade transactional messages Built-in fault tolerance and high availability configuration options based on DLedger A variety of cross-language clients, such as Java, C/C++...
Topics: GitHub, code, software, git
25
25
May 2, 2020
05/20
by
apache
software
eye 25
favorite 0
comment 0
Apache Flink Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flink at https://flink.apache.org/ Features A streaming-first runtime that supports both batch processing and data streaming programs Elegant and fluent APIs in Java and Scala A runtime that supports very high throughput and low event latency at the same time Support for event time and out-of-order processing in the DataStream API, based...
Topics: GitHub, code, software, git
31
31
Oct 5, 2017
10/17
by
apache
software
eye 31
favorite 0
comment 0
Mirror of Apache Hadoop ZooKeeper Apache ZooKeeper For the latest information about Apache ZooKeeper, please visit our website at: http://zookeeper.apache.org/ and our wiki, at: https://cwiki.apache.org/confluence/display/ZOOKEEPER Full documentation for this release can also be found in docs/index.html Packaging/release artifacts The release artifact contains the following jar file at the top level: zookeeper-.jar - legacy jar file which contains all classes and source files. Prior to version...
Topics: GitHub, code, software, git
5
5.0
Aug 22, 2021
08/21
by
apache
software
eye 5
favorite 0
comment 0
Apache Arrow DataFusion and Ballista query engines DataFusion DataFusion is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format. DataFusion supports both an SQL and a DataFrame API for building logical query plans as well as a query optimizer and execution engine capable of parallel execution against partitioned data sources (CSV and Parquet) using threads. DataFusion also supports distributed query execution via the Ballista crate. Use Cases...
Topics: GitHub, code, software, git
2
2.0
May 8, 2022
05/22
by
apache
software
eye 2
favorite 0
comment 0
Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data. Apache Camel Apache Camel is an Open Source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data. Introduction Camel empowers you to define routing and mediation rules in a variety of domain-specific languages (DSL, such as Java, XML, Groovy, Kotlin, and YAML). This means you get smart...
Topics: GitHub, code, software, git
7
7.0
Mar 25, 2022
03/22
by
apache
software
eye 7
favorite 0
comment 0
Upserts, Deletes And Incremental Processing on Big Data. Apache Hudi Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). https://hudi.apache.org/ Features Upsert support with fast, pluggable indexing Atomically publish data with rollback support Snapshot isolation between writer & queries Savepoints for data recovery Manages file...
Topics: GitHub, code, software, git
33
33
Feb 19, 2018
02/18
by
apache
software
eye 33
favorite 0
comment 0
Apache OpenWhisk is a serverless event-based programming service and an Apache Incubator project. OpenWhisk OpenWhisk is a cloud-first distributed event-based programming service. It provides a programming model to upload event handlers to a cloud service, and register the handlers to respond to various events. Learn more at http://openwhisk.incubator.apache.org. Quick Start (Vagrant) Native development (Mac and Ubuntu) Learn concepts and commands Issues Slack Quick Start A Vagrant machine is...
Topics: GitHub, code, software, git
23
23
Jul 20, 2019
07/19
by
apache
software
eye 23
favorite 0
comment 0
Mirror of Apache Kafka Apache Kafka See our web site for details on the project. You need to have Gradle and Java installed. Kafka requires Gradle 5.0 or higher. Java 8 should be used for building in order to support both Java 8 and Java 11 at runtime. Scala 2.12 is used by default; see below for how to use a different Scala version or all of the supported Scala versions. First bootstrap and download the wrapper: cd kafka_source_dir; gradle Now everything else will work. Build a jar and run it...
Topics: GitHub, code, software, git
5
5.0
Jan 13, 2022
01/22
by
apache
software
eye 5
favorite 0
comment 0
Apache Superset is a Data Visualization and Data Exploration Platform Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Release Notes | Get Involved | Contributor Guide | Resources | Organizations Using Superset Why Superset? Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset provides:...
Topics: GitHub, code, software, git
32
32
Jun 29, 2019
06/19
by
apache
software
eye 32
favorite 0
comment 0
Mirror of Apache Lucene + Solr Apache Lucene and Solr Apache Lucene is a high-performance, full featured text search engine library written in Java. Apache Solr is an enterprise search platform written using Apache Lucene. Major features include full-text search, index replication and sharding, and result faceting and highlighting. Online Documentation This README file only contains basic setup instructions. For more comprehensive documentation, visit: Lucene:...
Topics: GitHub, code, software, git
33
33
Jul 1, 2017
07/17
by
apache
software
eye 33
favorite 0
comment 0
Mirror of Apache RocketMQ Apache RocketMQ Apache RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability. It offers a variety of features: Pub/Sub messaging model Scheduled message delivery Message retroactivity by time or offset Log hub for streaming Big data integration Reliable FIFO and strict ordered messaging in the same queue Efficient pull&push consumption model Million-level...
Topics: GitHub, code, software, git
30
30
Dec 25, 2019
12/19
by
apache
software
eye 30
favorite 0
comment 0
Upserts And Incremental Processing on Big Data Apache Hudi (Incubating) Apache Hudi (Incubating) (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). https://hudi.apache.org/ Features Upsert support with fast, pluggable indexing Atomically publish data with rollback support Snapshot isolation between writer & queries Savepoints for data recovery...
Topics: GitHub, code, software, git
19
19
Apr 11, 2021
04/21
by
apache
software
eye 19
favorite 0
comment 0
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`. Dolphin Scheduler Official Website dolphinscheduler.apache.org Design Features DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the...
Topics: GitHub, code, software, git
72
72
Aug 24, 2019
08/19
by
apache
software
eye 72
favorite 0
comment 0
Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application Superset Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application [this project used to be named Caravel, and Panoramix in the past] Screenshots & Gifs View Dashboards Slice & dice your data Query and visualize your data with SQL Lab Visualize geospatial data with deck.gl Choose from a wide array of visualizations Apache Superset Apache Superset is a...
Topics: GitHub, code, software, git
27
27
Aug 16, 2019
08/19
by
apache
software
eye 27
favorite 0
comment 0
APM, Application Performance Monitoring System Apache SkyWalking SkyWalking: an APM (application performance monitor) system, especially designed for microservices, cloud native and container-based (Docker, Kubernetes, Mesos) architectures. Abstract SkyWalking is an open source APM system, including monitoring, tracing, and diagnosing capabilities for distributed systems in Cloud Native architectures. The core features are as follows. Service, service instance, endpoint metrics analysis Root cause...
Topics: GitHub, code, software, git
19
19
Apr 21, 2020
04/20
by
apache
software
eye 19
favorite 0
comment 0
Distributed database middleware ShardingSphere - Distributed Database Middleware Ecosphere Official website: https://shardingsphere.apache.org/ Document Overview ShardingSphere is an open-source ecosystem consisting of a set of distributed database middleware solutions, including 2 independent products, Sharding-JDBC & Sharding-Proxy & Sharding-Sidecar (todo). They all provide functions of data sharding, distributed transaction and database orchestration, applicable in a variety of...
Topics: GitHub, code, software, git
21
21
Apr 11, 2021
04/21
by
apache
software
eye 21
favorite 0
comment 0
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`. Dolphin Scheduler Official Website dolphinscheduler.apache.org Design Features DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the...
Topics: GitHub, code, software, git