2023/08/31 | Delta | | What is the Delta Lake Transaction Log? |
2023/08/25 | Spark, Delta | | Why Structured Streaming and Delta Lake for Batch ETL? |
2023/07/27 | LLMs | | Quick Start with llama.cpp with Llama 2 and Macbook M2 Air |
2023/06/29 | LLMs, Spark | Databricks | Introducing English as the New Programming Language for Apache Spark |
2023/06/29 | Delta Lake | Databricks | Announcing Delta Lake 3.0 with New Universal Format and Liquid Clustering |
2023/06/26 | LLMs | site | LLM Avalanche: Over 40 speakers and 900 people attended this LLM conference-within-a-conference to kick start Data + AI Summit 2023 |
2023/03/20 | Spark, Delta | | Why does altering a Delta Lake table schema not show up in the Spark DataFrame? |
2022/12/13 | Delta Lake | delta.io | Building a more efficient data infrastructure for machine learning with Open Source using Delta Lake, Amazon SageMaker, and EMR |
2022/11/10 | community | Integration Developer News | How Developers Can Manage and Contribute to Successful Open-Source Projects |
2022/08/11 | Delta Lake | delta.io | Apache Flink Source Connector for Delta Lake tables |
2022/08/02 | Delta Lake | delta.io | Delta 2.0 - The Foundation of your Data Lakehouse is Open |
2022/06/15 | Databricks | Databricks | Defining the Future of Data & AI: Announcing the Finalists for the 2022 Databricks Data Team OSS Award |
2022/05/18 | Delta Lake | delta.io | Multi-cluster writes to Delta Lake Storage in S3 |
2022/05/05 | Delta Lake | delta.io | Delta Lake 1.2 - More Speed, Efficiency and Extensibility Than Ever |
2022/04/27 | Delta Lake | delta.io | Writing to Delta Lake from Apache Flink |
2022/03/24 | Delta Lake, Trino | Starburst | Starburst and Databricks Collaborate on the Trino Delta Lake Connector |
2022/03/16 | Delta Lake | Databricks | Extending Delta Sharing to Google Cloud Storage |
2022/03/12 | Delta Lake, PrestoDB | PrestoDB | Native Delta Lake Connector for Presto |
2022/01/31 | Delta Lake | Databricks | Make Your Data Lakehouse Run, Faster With Delta Lake 1.1 |
2022/01/28 | Delta Lake | Databricks | The Ubiquity of Delta Standalone: Java, Scala, Hive, Presto, Trino, Power BI, and More! |
2022/01/21 | Delta Lake | Databricks | Extending Delta Sharing for Azure |
2021/12/01 | Delta Lake | Databricks | The Foundation of Your Lakehouse Starts With Delta Lake |
2021/04/23 | podcasts | Databricks | How We Launched a Podcast: Lessons, (Minor) Mishaps & Key Takeaways |
2021/04/21 | Delta Lake | Databricks | Attack of the Delta Clones (Against Disaster Recovery Availability Complexity) |
2021/02/10 | Delta Lake | Databricks | Automatically Evolve Your Nested Column Schema, Stream From a Delta Table Version, and Check Your Constraints |
2020/12/22 | Delta Lake | Databricks | Natively Query Your Delta Lake With Scala, Java, and Python |
2020/11/20 | Delta Lake | Databricks | How Scribd Uses Delta Lake to Enable the World’s Largest Digital Library |
2020/09/29 | Delta Lake | Databricks | Diving Into Delta Lake: DML Internals (Update, Delete, Merge) |
2020/08/27 | Delta Lake | Databricks | Enabling Spark SQL DDL and DML in Delta Lake on Apache Spark 3.0 |
2020/06/18 | Delta Lake | Databricks | Time Traveling with Delta Lake: A Retrospective of the Last Year |
2020/05/19 | Delta Lake | Databricks | Schema Evolution in Merge Operations and Operational Metrics in Delta Lake |
2020/04/14 | health | Databricks | COVID-19 Datasets Now Available on Databricks: How the Data Community Can Help |
2020/01/29 | Delta Lake | Databricks | Query Delta Lake Tables from Presto and Athena, Improved Operations Concurrency, and Merge performance |
2019/11/05 | ML | Databricks | Using AutoML Toolkit’s FamilyRunner Pipeline APIs to Simplify and Automate Loan Default Predictions |
2019/10/03 | Delta Lake | Databricks | Simple, Reliable Upserts and Deletes on Delta Lake Tables using Python APIs |
2019/09/24 | Delta Lake | Databricks | Diving Into Delta Lake: Schema Enforcement & Evolution |
2019/09/10 | ML | Databricks | Using AutoML Toolkit to Automate Loan Default Predictions |
2019/08/21 | Delta Lake | Databricks | Diving Into Delta Lake: Unpacking The Transaction Log |
2019/08/14 | Delta Lake, ML | Databricks | Productionizing Machine Learning with Delta Lake |
2019/06/18 | Delta Lake, Streaming | Databricks | Simplifying Streaming Stock Analysis using Delta Lake and Apache Spark: On-Demand Webinar and FAQ Now Available! |
2019/05/02 | ML | Databricks | Detecting Financial Fraud at Scale with Decision Trees and MLflow on Databricks |
2019/04/30 | ML, MLflow | Databricks | Using Dynamic Time Warping and MLflow to Detect Sales Trends |
2019/04/30 | ML, MLflow | Databricks | Understanding Dynamic Time Warping |
2018/11/13 | ML | Databricks | Applying your Convolutional Neural Network: On-Demand Webinar and FAQ Now Available! |
2018/10/29 | Delta | Databricks | Simplifying Change Data Capture with Databricks Delta |
2018/10/22 | ML | Databricks | Training your Neural Network: On-Demand Webinar and FAQ Now Available! |
2018/10/03 | ML, MLflow | Databricks | MLflow v0.7.0 Features New R API by RStudio |
2018/10/01 | ML | Databricks | Introduction to Neural Networks: On-Demand Webinar and FAQ Now Available! |
2018/09/18 | ML, Spark | Databricks | Simplify Market Basket Analysis using FP-growth on Databricks |
2018/09/13 | ML | Databricks | Identify Suspicious Behavior in Video with Databricks Runtime for Machine Learning |
2018/09/12 | ML, MLflow | Databricks | MLflow On-Demand Webinar and FAQ Now Available! |
2018/09/09 | Delta Lake | Databricks | Building a Real-Time Attribution Pipeline with Databricks Delta |
2018/09/09 | ML | Databricks | Loan Risk Analysis with XGBoost and Databricks Runtime for Machine Learning |
2018/08/08 | MLflow | Databricks | MLflow 0.4.2 Released |
2018/07/19 | Spark | Databricks | Simplify Advertising Analytics Click Prediction with Databricks Unified Analytics Platform |
2018/07/19 | Spark, Delta | Databricks | Simplify Streaming Stock Data Analysis Using Databricks Delta |
2018/07/19 | Streaming, Spark, Delta | Databricks | Make Your Oil and Gas Assets Smarter by Implementing Predictive Maintenance with Databricks |
2018/07/09 | Spark | Databricks | Analyze Games from European Soccer Leagues with Apache Spark and Databricks |
2018/07/02 | Spark, Streaming | Databricks | Build a Mobile Gaming Events Data Pipeline with Databricks Delta |
2018/06/27 | R | Databricks | Announcing RStudio and Databricks Integration |
2017/11/07 | CosmosDB | github | Lambda Architecture with Azure Cosmos DB and HDInsight (Apache Spark) |
2017/07/01 | Spark | O’Reilly | Introduction to Apache Spark 2.0 |
2017/02/18 | Spark | book | Learning PySpark: Build data-intensive applications locally and deploy at scale using the combined powers of Python and Spark 2.0 |
2016/12/02 | Spark | github | How Apache Spark performs a fast count using the parquet metadata |
2016/06/30 | Spark | Databricks | Introducing Getting Started with Apache Spark on Databricks |
2016/06/22 | Spark | Databricks, KDNuggets | Apache Spark Key Terms, Explained |
2016/06/08 | Spark | Databricks | Another Record-Setting Spark Summit |
2016/05/28 | Spark | | On-Time Flight Performance with GraphFrames for Apache Spark |
2016/05/24 | Spark, Genomics | Databricks | Predicting Geographic Population using Genome Variants and K-Means |
2016/05/24 | Spark, Genomics | Databricks | Parallelizing Genome Variant Analysis |
2016/05/24 | Spark, Genomics | Databricks | Genome Sequencing in a Nutshell |
2016/03/16 | Spark, Graph | Databricks | On-Time Flight Performance with GraphFrames for Apache Spark |
2016/02/11 | Spark, ML | InfoWorld | Why you should use Spark for machine learning |
2016/02/11 | Spark | | Presentation: Jump Start into Apache® Spark™ 2.0 |
2016/02/02 | Spark | Databricks | An Illustrated Guide to Advertising Analytics |
2015/12/19 | community | Databricks | Databricks launches Meetup-in-a-box for Apache Spark Meetup Organizers |
2015/11/09 | Spark | insideBIGDATA | Apache Spark is the Smartphone of Big Data |
2015/09/24 | Spark | Databricks | Spark Survey 2015 Results are now available |
2015/08/31 | Spark | Databricks | Data Exploration with Databricks |
2015/06/09 | Spark | Databricks | Introduction to Databricks |
2015/06/04 | Spark, ML | Databricks | Simplify Machine Learning on Apache Spark with Databricks |
2014/01/06 | HDFS, pig | | Quick Tip for Compressing Many Small Text Files within HDFS via Pig |
2013/09/30 | SSAS | | Analysis Services Multidimensional: It is the Order of Things |
2013/05/14 | random | | In the context of quantum entanglement and time travel – Stargate may be more correct than Star Trek |
2013/04/26 | Hive | | Optimizing Joins running on HDInsight Hive on Azure at GFS |
2013/03/18 | blob | | Why use Blob Storage with HDInsight on Azure |
2013/03/12 | Avro, Hadoop | | Using Avro with HDInsight on Azure at 343 Industries |
2013/02/04 | Spark | | Installing Spark 0.6.1 Standalone on OSX Mountain Lion (10.8) |
2012/12/03 | Hadoop, pig | | Getting your Pig to eat ASV blobs in Windows Azure HDInsight |
2012/09/26 | SSAS, Hive | Microsoft | SQL Server Analysis Services to Hive (backup) |
2012/09/03 | random | | In the context of quantum entanglement and teleportation – Stargate may be more correct than Star Trek |
2012/06/28 | SSAS | Microsoft | Microsoft SQL Server Analysis Services Multidimensional Performance and Operations Guide |
2012/05/08 | Hadoop | | Installing Hadoop on OSX Lion (10.7) |
2012/03/01 | Hadoop, BI | | BI and Big Data – the best of both worlds! |
2012/02/17 | Hadoop, JS | | Hadoop JavaScript – Microsoft’s VB shift for Big Data |
2012/01/31 | big data | | Moving data to compute or compute to data? That is the Big Data question |
2012/01/24 | big data | | Scale Up or Scale Out your Data Problems? A Space Analogy |
2012/01/21 | PowerPivot, Hadoop | | Connecting PowerPivot to Hadoop on Azure – Self Service BI to Big Data in the Cloud |
2012/01/12 | Hadoop, Azure | | A funky way to do Hive and Hadoop … on Azure |
2011/12/15 | Hadoop, Azure | | An Azure Elephant Never Forgets… |
2011/10/01 | MS-SQL | Microsoft | SQL Server 2008 R2: Analysis Services Performance Guide (backup) |
2010/12/10 | MS-SQL | Microsoft | Measuring and Understanding the Performance of Your SSIS Packages in the Enterprise (SQL Server Video) |
2010/07/01 | MS-SQL | Microsoft | Analysis Services ROLAP for SQL Server Data Warehouses (backup) |
2010/06/01 | MS-SQL | Microsoft | Scale-Out Querying for Analysis Services with Read-Only Databases (backup) |
2009/12/22 | Healthcare | book | Transforming Health Care Through Information: Case Studies (Health Informatics) |
2009/12/16 | MS-SQL | book | Professional Microsoft SQL Server Analysis Services 2008 with MDX |
2009/05/12 | MS-SQL | Microsoft | Disk Partition Alignment Best Practices for SQL Server |
2008/11/05 | MS-SQL | Microsoft | Reaching Compliance: SQL Server 2008 Compliance Guide (backup) |
2008/04/17 | MS-SQL | Microsoft | Analysis Services Distinct Count Optimization (backup) |
2007/09/24 | Privacy | | Analyzing Data while Protecting Privacy – A Differential Privacy Case Study |
2007/09/01 | MS-SQL | Microsoft | SQL Server 2005: Precision Considerations for Analysis Services Users (backup) |
2006/03/02 | Research | paper (acknowledgement) | Early establishment of a pool of latently infected, resting CD4+ T cells during primary HIV-1 infection |
2001/10/01 | MS-SQL | book | Professional SQL Server 2000 Data Warehousing with Analysis Services |