Versioning, Syncing & Streaming Large Datasets Using DAT + Node

JSConf via YouTube

Classroom Contents

  1. Intro
  2. dat is an open source tool for sharing and collaborating on data
  3. analogy time: let's talk about source control
  4. life before git
  5. 1. somehow get a zip of cool-project 2. unpack and edit a file 3. email the file back 4. ????
  6. maintainer creates new zip of cool-project that might contain my fix
  7. claim: currently data sharing is a mess
  8. email csv files
  9. database dumps in git
  10. we want to do for data what git did for source code
  11. npm install -g dat
  12. max, import your genome into dat
  13. data is stored locally in leveldb; blobs are stored in blob-stores
  14. choose the blob store that fits your use case: s3, local-fs
  15. auto schema generation - free REST API - *all* APIs are streaming
  16. a data set we can all relate to
  17. calculate how big npm is using dat
  18. dat cat transform
  19. dat cat docker run -i transform
  20. transform the npm data using bulk-markdown-to-png
  21. use case: trillian astronomical
  22. 1. full sky scans 2. detect objects
  23. problems: huge files, weird format
  24. 1TB gzipped CSVs, 600 million objects, 300 columns, 40TB imagery
  25. data pipelines, dependency management, data streaming
  26. gasket is a cross platform pipeline manager
  27. datscript is an experimental pipeline config language
  28. the future
  29. branches, dat checkout 3b2d98V3, multi master replication, sync to databases, registry
