2017 in Review
First off, this was an amazing year for Big Data, IoT, Streaming, Machine Learning and Deep Learning. There were so many cool events, updates, new products, new projects and new libraries, plus a lot of community growth. I've seen a lot of people adopt and grow Big Data and streaming projects from nothing. Using the power of Open Source and the tools made available by Apache, companies are growing with the help of trusted partners and a community of engineers and users.
We had three awesome DataWorks Summits (formerly Hadoop Summit, but now covering a lot more, including IoT, AI and Streaming).
I attended Munich and spoke at Sydney. I missed California, but all the videos and slides were online and I loved those.
I spoke at Oracle Code in NYC, which was a fun little event. I was surprised to learn that many people had never heard of Apache NiFi or knew how easily you could use it to build real-time dataflows, including Deep Learning and Big Data.
I got to talk to a lot of interesting people while working the Hortonworks Booth at Strata NYC. It was such a huge event; fidget spinners and streaming were the main takeaways there.
We had a lot of awesome meetups in Princeton and in the NYC and Philadelphia areas. The Princeton Future of Data Group grew to over 750 members! It's a great community of data scientists, engineers, students, analysts, techies and business thought leaders. I am really proud to be a part of this amazing group.
Meetups
I got to speak at most of the meetups, except when we had special guests. I had some great NY/NJ/Philly teammates co-running the meetup: @milind pandit and @greg Keys. Greg and I also created a North Jersey meetup.
November 14th - Enterprise Data at Scale
I spoke on IBM DSX, Apache NiFi, Apache Spark, Python, Jupyter and Data Science. Fortunately, I had two excellent IBM resources assisting me.
October 5th - Deep Learning with DeepLearning4J (DL4J). A great talk by my friend from SkyMind. It's nice to see their project get accepted to Eclipse.
August 8th - Deep Dive into HDF 3.0 @ Honeywell
June 20th - Latest Innovation - Schema Registry and More @ TRAC Intermodal
May 16th - Hadoop Tools Overview
March 28th - Apache NiFi: Ingesting Enterprise Data at Scale
Libraries, SDKs, Tools, Frameworks
TensorFlow
Apache MXNet
NLTK
Apache OpenNLP
Apache Tika
Apache NiFi Custom Processors
OpenCV
Apache NiFi 1.4
Apache Zeppelin
IBM DSX
Apache Spark 2.x
Apache Hive LLAP
Apache HBase with Apache Phoenix
Apache ORC
Apache Hadoop
Hortonworks Schema Registry
Hortonworks Streaming Analytics Manager
Druid
Apache Superset - Now in Apache
PyTorch
Apache Storm - Big Updates
Devices
Raspberry Pi Zero Wireless
Raspberry Pi 3B+
Movidius
Nvidia Jetson TX1
Matrix Creator
Google AIY Voice Kit
Kudrone
Christmas Tree Hat
Sense Hat
Many Cameras and Video Cameras
NanoPi Duo
Tinkerboard
There was a lot of big news this year; see https://hortonworks.com/blog/top-hortonworks-blogs-2017/. Apache Hive LLAP became a real production thing and brought Apache Hadoop into the world of EDW as a completely Open Source option. On the Apache Spark front, we passed version 2.0, and Livy became a production standby and joined Apache as Apache Livy. The JanusGraph database appeared and is quickly becoming the standard for graphs. Apache Calcite went into so many projects that SQL queries are everywhere, including in Apache NiFi. A huge number of interesting software projects arose, including Hortonworks Data Plane, Hortonworks Schema Registry and Hortonworks Streaming Analytics Manager. This was an awesome year for software.
Presentations From Talks Available
https://www.slideshare.net/bunkertor/enterprise-data-science-at-scale-princeton-nj-14nov2017
https://www.slideshare.net/bunkertor/realtime-ingesting-and-transforming-sensor-data-social-data-w-n...
https://www.slideshare.net/bunkertor/introduction-to-hdf-30
https://www.slideshare.net/bunkertor/introduction-to-hadoop-76031567
https://www.slideshare.net/bunkertor/apache-nifi-ingesting-enterprise-data-at-scale
https://www.slideshare.net/bunkertor/ingesting-drone-data-into-big-data-platforms
My HCC Articles of 2017
https://community.hortonworks.com/articles/80412/working-with-airbnbs-superset.html
https://community.hortonworks.com/articles/116803/building-a-custom-processor-in-apache-nifi-12-for....
https://community.hortonworks.com/articles/79842/ingesting-osquery-into-apache-phoenix-using-apache....
https://community.hortonworks.com/articles/97062/query-hive-using-python.html
https://community.hortonworks.com/articles/79008/using-the-hadoop-attack-library-to-check-your-hado....
https://community.hortonworks.com/articles/81222/adding-stanford-corenlp-to-big-data-pipelines-apac....
https://community.hortonworks.com/articles/81270/adding-stanford-corenlp-to-big-data-pipelines-apac-...
https://community.hortonworks.com/articles/88404/adding-and-using-hplsql-and-hivemall-with-hive-mac....
https://community.hortonworks.com/articles/149891/handling-hl7-records-and-storing-in-apache-hive-fo...
https://community.hortonworks.com/articles/87632/ingesting-sql-server-tables-into-hive-via-apache-n....
https://community.hortonworks.com/articles/73828/submitting-spark-jobs-from-apache-nifi-using-livy.h...
https://community.hortonworks.com/articles/76240/using-opennlp-for-identifying-names-from-text.html
https://community.hortonworks.com/articles/136024/integrating-nvidia-jetson-tx1-running-tensorrt-int...
https://community.hortonworks.com/articles/136026/integrating-nvidia-jetson-tx1-running-tensorrt-int...
https://community.hortonworks.com/articles/136028/integrating-nvidia-jetson-tx1-running-tensorrt-int...
https://community.hortonworks.com/articles/136039/integrating-nvidia-jetson-tx1-running-tensorrt-int...
https://community.hortonworks.com/articles/150026/hl7-processing-part-3-apache-zeppelin-sql-bi-and-a...
https://community.hortonworks.com/articles/104226/simple-backups-of-hadoop-with-apache-nifi-12.html
https://community.hortonworks.com/articles/77609/securing-your-clusters-in-the-public-cloud.html
https://community.hortonworks.com/articles/92495/monitor-apache-nifi-with-apache-nifi.html
https://community.hortonworks.com/articles/77621/creating-an-email-bot-in-apache-nifi.html
https://community.hortonworks.com/articles/80418/open-nlp-example-apache-nifi-processor.html
https://community.hortonworks.com/articles/76924/data-processing-pipeline-parsing-pdfs-and-identify....
https://community.hortonworks.com/articles/86801/working-with-s3-compatible-data-stores-via-apache.h...
https://community.hortonworks.com/articles/101904/part-2-iot-augmenting-gps-data-with-weather.html
https://community.hortonworks.com/articles/118148/creating-wordclouds-from-dataflows-with-apache-nif...
https://community.hortonworks.com/articles/121916/controlling-big-data-flows-with-gestures-minifi-ni...
https://community.hortonworks.com/articles/76935/using-sentiment-analysis-and-nlp-tools-with-hdp-25....
https://community.hortonworks.com/articles/87397/steganography-with-apache-nifi-1.html
https://community.hortonworks.com/articles/83100/deep-learning-iot-workflows-with-raspberry-pi-mqtt....
https://community.hortonworks.com/articles/154957/converting-json-to-sql-ddl.html
https://community.hortonworks.com/articles/81694/extracttext-nifi-custom-processor-powered-by-apach....
https://community.hortonworks.com/articles/92345/store-a-flow-to-disk-and-then-reserialize-it-to-co....
https://community.hortonworks.com/articles/92496/qadcdc-our-how-to-ingest-some-database-tables-to-h....
https://community.hortonworks.com/articles/73811/trigger-sonicpi-music-via-apache-nifi.html
https://community.hortonworks.com/articles/99861/ingesting-ibeacon-data-via-ble-to-mqtt-wifi-gatewa....
https://community.hortonworks.com/articles/101679/iot-ingesting-gps-data-from-raspberry-pi-zero-wire...
https://community.hortonworks.com/articles/104255/ingesting-and-testing-jms-data-with-nifi.html
https://community.hortonworks.com/articles/89455/ingesting-gps-data-from-onion-omega2-devices-with.h...
https://community.hortonworks.com/articles/89547/tracking-phone-location-for-android-and-iot-with-o....
https://community.hortonworks.com/articles/107379/minifi-for-image-capture-and-ingestion-from-raspbe...
https://community.hortonworks.com/articles/108947/minifi-for-ble-bluetooth-low-energy-beacon-data-in...
https://community.hortonworks.com/articles/108966/minifi-for-sensor-data-ingest-from-devices.html
https://community.hortonworks.com/articles/110469/simple-backup-and-restore-of-hdfs-data-via-hdf-30....
https://community.hortonworks.com/articles/110475/ingesting-sensor-data-from-raspberry-pis-running-r...
https://community.hortonworks.com/articles/118132/minifi-capturing-converting-tensorflow-inception-t...
https://community.hortonworks.com/articles/122077/ingesting-csv-data-and-pushing-it-as-avro-to-kafka...
https://community.hortonworks.com/articles/130814/sensors-and-image-capture-and-deep-learning-analys...
https://community.hortonworks.com/articles/86570/hosting-and-ingesting-data-from-web-pages-desktop.h...
https://community.hortonworks.com/articles/142686/real-time-ingesting-and-transforming-sensor-and-so...
https://community.hortonworks.com/articles/77988/ingest-remote-camera-images-from-raspberry-pi-via.h...
https://community.hortonworks.com/articles/108718/ingesting-rdbms-data-as-new-tables-arrive-automagi...
https://community.hortonworks.com/articles/149982/hl7-ingest-part-4-streaming-analytics-manager-and....
https://community.hortonworks.com/articles/149910/handling-hl7-records-part-1-hl7-ingest.html
https://community.hortonworks.com/articles/80339/iot-capturing-photos-and-analyzing-the-image-with.h...
https://community.hortonworks.com/articles/77403/basic-image-processing-and-linux-utilities-as-part....
https://community.hortonworks.com/articles/103863/using-an-asus-tinkerboard-with-tensorflow-and-pyth...
https://community.hortonworks.com/articles/146704/edge-analytics-with-nvidia-jetson-tx1-running-apac...
https://community.hortonworks.com/articles/148730/integrating-apache-spark-2x-jobs-with-apache-nifi....
https://community.hortonworks.com/articles/154760/generating-avro-schemas-and-ensuring-field-names-m...
https://community.hortonworks.com/articles/155326/monitoring-energy-usage-utilizing-apache-nifi-pyth...
My Articles on DZone
https://dzone.com/articles/generating-avro-schemas-and-ensuring-field-names-m
https://dzone.com/articles/favorite-tech-of-the-year-early-edition
https://dzone.com/articles/integrating-apache-spark-2x-jobs-with-apache-nifi
https://dzone.com/articles/using-jolt-in-big-data-streams-to-remove-nulls
https://dzone.com/articles/processing-hl7-records
https://dzone.com/articles/big-data-is-growing
https://dzone.com/articles/ingesting-rdbms-data-as-new-tables-arrive-automagi
https://dzone.com/articles/using-websockets-with-apache-nifi
https://dzone.com/articles/using-the-new-flick-hat-for-raspberry-pi
https://dzone.com/articles/real-time-ingest-and-ai
https://dzone.com/articles/tensorflow-for-real-world-applications
https://dzone.com/articles/integrating-nvidia-jetson-tx1-running-tensorrt-int
https://dzone.com/articles/real-time-tensorflow-camera-analysis-with-sensors
https://dzone.com/articles/tensorflow-and-nifi-big-data-ai-sandwich
https://dzone.com/articles/minifi-capturing-converting-tensorflow-inception-t
https://dzone.com/articles/creating-wordclouds-from-dataflows-with-apache-nif
https://dzone.com/articles/building-a-custom-processor-in-apache-nifi-12-for
https://dzone.com/articles/data-engineer-as-dj
https://dzone.com/articles/how-to-automatically-migrate-all-tables-from-a-dat
https://dzone.com/articles/dataworks-summit-2017-sj-updates
https://dzone.com/articles/hdf-30-for-utilities
https://dzone.com/articles/hdp-26-what-why-how-and-now
https://dzone.com/articles/using-apache-minifi-on-edge-devices-part-1
https://dzone.com/articles/creating-an-email-bot-in-apache-nifi
https://dzone.com/articles/this-week-in-hadoop-and-more-deep-deep-learning-an
https://dzone.com/articles/using-python-for-big-data-workloads-part-2
https://dzone.com/articles/using-tinkerboard-with-tensorflow-and-python
https://dzone.com/articles/using-python-for-big-data-workloads-part-1
https://dzone.com/articles/part-2-iot-augmenting-gps-data-with-weather
https://dzone.com/articles/this-week-in-hadoop-and-more-apache-calcite-kylin
https://dzone.com/articles/iot-ingesting-gps-data-from-raspberry-pi-zero-wire
https://dzone.com/articles/a-new-era-of-open-source-streaming
https://dzone.com/articles/day-1-dataworks-summit-munich-report
https://dzone.com/articles/this-week-in-hadoop-and-more-dl-conferences-course
https://dzone.com/articles/advanced-apache-nifi-flow-techniques
https://dzone.com/articles/a-big-data-reference-architecture-for-iot
https://dzone.com/articles/ingesting-gps-data-from-onion-omega2-devices-with-apache-nifi
https://dzone.com/articles/sentiment-shoot-out
https://dzone.com/articles/best-of-dataworks-summit-2017-munich
https://dzone.com/articles/deep-learning-on-big-data-platforms
https://dzone.com/articles/tensorflow-on-the-edge-part-2-of-5
https://dzone.com/articles/this-week-in-hadoop-and-more-nifi-drones-dataworks
https://dzone.com/articles/oracle-code-new-york-report
https://dzone.com/articles/deep-learning-for-data-engineers-part-1
https://dzone.com/articles/this-week-in-hadoop-and-more-keras-deep-learning-a
https://dzone.com/articles/happy-pi-day-2017
https://dzone.com/articles/deep-learning-and-machine-learning-guide-part-iii
https://dzone.com/articles/this-week-in-hadoop-and-more-deep-and-machine-lear
https://dzone.com/articles/backup-restore-dr
https://dzone.com/articles/big-data-performance-part-1
https://dzone.com/articles/nifi-spark-hbase-kafka-machine-learning-and-deep-l
https://dzone.com/articles/hadoop-101-hbase-client-access
https://dzone.com/articles/deep-learning-and-machine-learning-guide-part-ii
https://dzone.com/articles/this-week-in-hadoop-and-more-cloud-visualization-d
https://dzone.com/articles/big-data-ml-dl-command-line-tools
https://dzone.com/articles/machine-learning-resources
https://dzone.com/articles/tensorflow-on-the-edge
https://dzone.com/articles/deep-learning-and-machine-learning-killer-tools-li
https://dzone.com/articles/cool-projects-big-data-machine-learning-apache-nifi
https://dzone.com/articles/protect-your-cloud-big-data-assets
https://dzone.com/articles/edge-testing-your-hadoop-environment
https://dzone.com/articles/this-week-in-hadoop-and-more-6
https://dzone.com/articles/picamera-ingest-real-time
https://dzone.com/articles/this-week-in-hadoop-and-more-nlp-and-dl
https://dzone.com/articles/quick-tips-apache-phoenixhbase
https://dzone.com/articles/the-physics-of-big-data
My RefCard
https://dzone.com/refcardz/introduction-to-tensorflow
My Guide
https://dzone.com/guides/artificial-intelligence-machine-learning-and-predi
My GitHub Source Code
I have some example Apache NiFi custom processors developed with JDK 8, including ones for TensorFlow, OpenNLP, DL4J, Apache Tika, Stanford CoreNLP and more. I also published all the Python scripts, documentation, shell scripts, SQL, Apache NiFi templates and Apache Zeppelin notebooks as Apache-licensed open source on GitHub.
https://github.com/tspannhw/nifi-tensorflow-processor
https://github.com/tspannhw/nifi-nlp-processor
https://github.com/tspannhw/nifi-attributecleaner-processor
https://github.com/tspannhw/apachelivy-nifi-spark2-integration
https://github.com/tspannhw/nvidiajetsontx1-mxnet
https://github.com/tspannhw/nifi-dl4j-processor
https://github.com/tspannhw/dws2017sydney
https://github.com/tspannhw/rpi-flickhat-minifi
https://github.com/tspannhw/rpi-rainbowhat
https://github.com/tspannhw/rpi-sensehat-minifi-python
https://github.com/tspannhw/rpizw-nifi-mqtt-gps
https://github.com/tspannhw/EnterpriseNIFI
https://github.com/tspannhw/IngestingDroneData
https://github.com/tspannhw/spy
https://github.com/tspannhw/webdataingest
https://github.com/tspannhw/mxnet_rpi
https://github.com/tspannhw/nifi-extracttext-processor
https://github.com/tspannhw/nifi-corenlp-processor
https://github.com/tspannhw/nlp-utilities
https://github.com/tspannhw/rpi-sensehat-mqtt-nifi
https://github.com/tspannhw/rpi-picamera-mqtt-nifi
https://github.com/tspannhw/iot-scripts
https://github.com/tspannhw/phoenix
https://github.com/tspannhw/hive
Next year will be amazing: more libraries, more use cases for Deep Learning, and enhancements to all the great projects and tools out there. Another Google AIY Kit, more DataWorks Summits, Hadoop 3, HDF 4, HDP 3; there are so many things to look forward to.
See you at meetups, summits and online next year.