This is my first project in a Cloudera Quickstart container. It takes a low-level approach to reading multiple large (100 GB) files and combining them into HDFS. It uses multiprocessing and runs quickly, with peak memory use of about 7-8 GB.
novaferg/cloudera-python-test
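
The general idea of combining several large files in parallel with bounded memory can be sketched as follows. This is a minimal illustration, not the code in this repository: the names `combine_files`, `copy_part`, and `CHUNK_SIZE` are hypothetical, the output is a local file rather than HDFS, and the `fork` start method assumes a Linux environment such as the Quickstart container. Each worker streams one source file into its own precomputed byte range of the output, so memory stays at roughly `workers * CHUNK_SIZE` regardless of file size.

```python
import multiprocessing as mp
import os

# Hypothetical chunk size; each worker holds at most one chunk in memory.
CHUNK_SIZE = 64 * 1024 * 1024


def copy_part(task):
    """Stream one source file into its byte range of the destination."""
    src, dst, offset = task
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fout.seek(offset)
        while True:
            chunk = fin.read(CHUNK_SIZE)
            if not chunk:
                break
            fout.write(chunk)
    return os.path.getsize(src)


def combine_files(sources, dst, workers=4):
    """Concatenate `sources` into `dst` in order, one worker per file."""
    sizes = [os.path.getsize(s) for s in sources]
    # Offset of each file is the total size of the files before it.
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]
    # Preallocate the destination so workers can write disjoint ranges.
    with open(dst, "wb") as fout:
        fout.truncate(sum(sizes))
    tasks = [(s, dst, off) for s, off in zip(sources, offsets)]
    # Assumption: Linux, so the "fork" start method is available.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        copied = pool.map(copy_part, tasks)
    return sum(copied)
```

Under these assumptions, the combined file could then be copied into HDFS with a command such as `hdfs dfs -put`; writing directly to HDFS from the workers would need a different approach, since this sketch relies on seeking within a local file.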