apache/spark
Apache Spark - A unified analytics engine for large-scale data processing
joernio/joern
Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. Discord https://discord.gg/vv4MH284Hc
delta-io/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
apache/kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
rtyley/bfg-repo-cleaner
Removes large or troublesome blobs like git-filter-branch does, but faster. And written in Scala
apache/incubator-gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
apache/pekko
Build highly concurrent, distributed, and resilient message-driven applications using Java/Scala
scala/scala
Scala 2 compiler and standard library. Scala 2 bugs at https://github.com/scala/bug; Scala 3 at https://github.com/scala/scala3
zio/zio
ZIO — A type-safe, composable library for async and concurrent programming in Scala
apache/incubator-livy
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
com-lihaoyi/mill
Mill is a build tool for Java, Scala and Kotlin: 3-6x faster than Maven or Gradle, less fiddling with plugins, and more easily explorable in your IDE
playframework/playframework
The Community Maintained High Velocity Web Framework For Java and Scala.
hyperledger-labs/splice
Reference applications for funding, operating, and incentivizing the use of a decentralized, public Canton synchronizer. Includes the Amulet reference application for creating native payment utilities for Canton synchronizers and Daml applications.