Author: Hotrod
Summary: Download CSV data from the web, reformat data to JSON
Keywords: csv, http
Categories: demo
CSV to JSON
This is a demo pipe that demonstrates how to:
- Download CSV data using HTTP
- Parse that data into JSON
- Perform various transformations
For this demo, the input is country data and metadata in CSV format, which is downloaded directly from Github.
The output that this pipe generates looks like this:
...
{"cca2":"CH","capital":"Bern","region":"Europe","subregion":"Western Europe","num":43,"name":"Switzerland","tld":".ch","latitude":47,"longitude":8}
{"cca2":"CL","capital":"Santiago","region":"Americas","subregion":"South America","num":44,"name":"Chile","tld":".cl","latitude":-30,"longitude":-71}
{"cca2":"CN","capital":"Beijing","region":"Asia","subregion":"Eastern Asia","num":45,"name":"China","tld":".cn","latitude":35,"longitude":105}
{"cca2":"CI","capital":"Yamoussoukro","region":"Africa","subregion":"Western Africa","num":46,"name":"Ivory Coast","tld":".ci","latitude":8,"longitude":-5}
{"cca2":"CM","capital":"Yaoundé","region":"Africa","subregion":"Middle Africa","num":47,"name":"Cameroon","tld":".cm","latitude":6,"longitude":12}
{"cca2":"CD","capital":"Kinshasa","region":"Africa","subregion":"Middle Africa","num":48,"name":"DR Congo","tld":".cd","latitude":0,"longitude":25}
...
Source dataset:
https://mledoze.github.io/countries/
Pipe Definition
name: demo-csv-to-json
# =========================================== input: download CSV file directory from the web
input:
http-poll:
address: https://raw.githubusercontent.com/mledoze/countries/master/dist/countries.csv
actions:
# =========================================== CSV to JSON
- expand:
input-field: _raw
remove: true
csv:
header: true
# =========================================== add a record count
- script:
let:
- num: count()
# =========================================== remove fields
- remove:
fields:
# - name
# - tld
# - cca2
- ccn3
- cca3
- cioc
- independent
- status
- unMember
- currencies
- idd
# - capital
- altSpellings
# - region
# - subregion
- languages
- translations
# - latlng
- landlocked
- borders
- area
- flag
- demonyms
- callingCodes
# =========================================== rename fields
- rename:
fields:
- name: orig_name
- tld: orig_tld
# =========================================== extract first item from name list
- extract:
input-field: orig_name
remove: true
pattern: '^([^,]+)'
output-fields:
- name
# =========================================== extract first item from tld list
- extract:
input-field: orig_tld
remove: true
pattern: '^([^,]+)'
output-fields:
- tld
# =========================================== extract lat/long into seperate fields
- extract:
input-field: latlng
remove: true
pattern: '^(.*),(.*)$'
output-fields:
- latitude
- longitude
# =========================================== convert lat/long from string to number to number
- convert:
fields:
- latitude: num
- longitude: num
# =========================================== output: print to the pipe's STDOUT
output:
print: STDOUT