Tutorial: Creating HDFS Snapshots And Recovering a Deleted File
In this tutorial, we focus on HDFS snapshots. Common use cases of HDFS snapshots include backups and protection against user errors. To demonstrate the functionality of HDFS snapshots, we create an...
Slides: Introduction to HCatalog
We are happy to share slides about HCatalog that come from Data Analyst Training delivered by GetInData. HCatalog allows users with different data processing tools (such as Apache Hive, Apache Pig,...
Slides: Quick Introduction to Apache Tez
We share our slides about Apache Tez, delivered by our consultant as a lightning talk at Warsaw Hadoop User Group. Tez is a highly efficient and scalable execution engine that can be easily...
Big Data Technology Summit – first truly technical Big Data conference in...
We are excited to announce that GetInData became a co-organizer (together with Evention) of Big Data Technology Summit. Big Data Technology Summit is the first truly technical Big Data conference in...
Tutorial: Using Presto to combine data from Hive and MySQL in one SQL-like query
Presto (originated at Facebook) is yet another distributed SQL query engine for Hadoop that has recently generated huge excitement. What makes Presto so interesting, especially in comparison to...
Surprising Sqoop-to-Hive Gotchas
In this blog post, I describe a few surprising gotchas related to the import of a MySQL table into Hive using Sqoop 1.4.5 (the most recent version supported by vendors like Hortonworks or Cloudera at...
Avoiding The Mess In The Hadoop Cluster (Part 1)
This blog series is based on the talk “Simplified Data Management and Process Scheduling in Hadoop” that we gave at Big Data Technical Conference in Poland in February 2015. Because the talk was very...
Recent Evolution of Zero Data Loss Guarantee in Spark Streaming With Kafka
When properly deployed, Spark Streaming 1.2 provides a zero data loss guarantee. To enjoy this mission-critical feature, you need to fulfil the following prerequisites: The input data comes from reliable...
Avoiding The Mess In The Hadoop Cluster (Part 2)
Sorry, this entry is only available in English.
GetInData is an official sponsor of Warsaw Hadoop User Group!
With great pleasure, we would like to announce that GetInData has become an official sponsor of Warsaw Hadoop User Group! As part of the sponsorship, GetInData will cover all fixed costs...
(English) Big Data Weekly Quiz #1
Sorry, this entry is only available in English.
(English) Big Data Weekly Quiz #3
Sorry, this entry is only available in English.
(English) Big Data Weekly Quiz #4
Sorry, this entry is only available in English.
(English) Big Data Weekly Quiz #6
Sorry, this entry is only available in English.
(English) Big Data Weekly Quiz #7
Sorry, this entry is only available in English.