Check encoding

  1. In the Cloud9 interface, run the command below to check the file's encoding.
enca -L none ./raw/listings/LOAD00000001.csv

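If you use your own dataset and its encoding is not UTF-8, you will need to convert it to UTF-8 using the commands below. Note that iconv writes the converted text to standard output, so redirect it to a new file if you want to keep the result.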

sudo apt-get install libc6-dev
iconv -f <Current Encoding> -t UTF-8 <path/dataset.csv>
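As a rough illustration of the conversion step, the sketch below converts a small throwaway file from ISO-8859-1 to UTF-8 and then checks the result. The helper name `to_utf8`, the sample file names, and the choice of ISO-8859-1 as the source encoding are assumptions made for this example only; substitute the encoding that the check above reports for your own dataset.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the workshop): convert a file from a
# known source encoding to UTF-8.
# Arguments: <source encoding> <input file> <output file>
to_utf8() {
  local from_enc="$1" src="$2" dst="$3"
  # iconv writes to standard output by default, so redirect into a new
  # file instead of overwriting the input while it is still being read.
  iconv -f "$from_enc" -t UTF-8 "$src" > "$dst"
}

# Demo on a temporary file: "café" in ISO-8859-1, where é is the byte 0xE9
# (octal 351). The directory is removed when the script exits.
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
printf 'caf\351\n' > "$tmpdir/latin1.csv"

to_utf8 ISO-8859-1 "$tmpdir/latin1.csv" "$tmpdir/utf8.csv"

# Converting UTF-8 to UTF-8 fails on invalid byte sequences, so a zero
# exit status here suggests the output is well-formed UTF-8.
iconv -f UTF-8 -t UTF-8 "$tmpdir/utf8.csv" > /dev/null && echo "valid UTF-8"
```

Writing to a separate output file keeps the original intact, which makes it easy to re-run the conversion if the source encoding was guessed incorrectly.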