Tableau is a great data visualization tool used throughout the industry. Let's use the business formation data published by the U.S. Census Bureau at https://www.census.gov/econ/bfs/csv/bfs_monthly.csv to build a quick-and-dirty Tableau dashboard. We will use Python to do a little data cleaning and transforming to make the data workable.
Let's use cURL to download the data into a local data folder, which is where the Python code reads it from.
curl --create-dirs -o data/bfs_monthly.csv https://www.census.gov/econ/bfs/csv/bfs_monthly.csv
Let's use Python to do some quick data transformation and cleaning. We will use BeautifulSoup to scrape the NAICS table on Wikipedia and build a dictionary that maps sector codes to descriptions, which makes the NAICS sector column readable.
import pandas as pd
from bs4 import BeautifulSoup
import requests as r
import re
df = pd.read_csv('data/bfs_monthly.csv')
# Fetch the page HTML and parse it with an explicitly named parser.
html = r.get('https://en.wikipedia.org/wiki/North_American_Industry_Classification_System').text
soup = BeautifulSoup(html, 'html.parser')
Let's scrape.
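# Assumption: pairwise was not defined in the original; this version yields
# non-overlapping (code, description) pairs from the flat list of cells.
def pairwise(iterable):
    it = iter(iterable)
    return zip(it, it)

# Index 2 selects the third wikitable on the page; the code assumes
# that table holds the sector codes and descriptions.
tables = soup.find_all('table', attrs={'class': 'wikitable'})
t = tables[2]
items = t.find_all('td')

# Collect the text of every non-empty cell, dropping newlines.
data = []
for x in items:
    if len(x.text) > 0:
        data.append(x.text.replace('\n', ''))

def splitter(x):
    # Keep only the first code of a range such as '31-33'.
    if '-' in x or '/' in x:
        return re.split('[-:,/]', x)[0]
    return x

# Map each sector code to its description.
d = {}
for a, b in pairwise(data):
    d[splitter(a)] = b

def converter(x):
    # Translate values like 'NAICS11' into the sector description;
    # anything else returns None.
    if 'NAICS' in x and any(map(str.isdigit, x)):
        sec = x.split('NAICS')[1]
        return d[sec]
    return None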
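To see what the pairing and lookup steps above produce without hitting the network, here is a minimal offline sketch. The cell texts are hand-written for illustration (not scraped output), and the names first_code, sector_names, and to_sector_name are hypothetical. It is a slightly more defensive variant of the lookup: unknown codes return None instead of raising a KeyError.

```python
import re

# Illustrative cell texts, hand-written for this sketch in the flattened
# order [code, description, code, description, ...]; not scraped output.
cells = [
    '11', 'Agriculture, Forestry, Fishing and Hunting',
    '31-33', 'Manufacturing',
    '44-45', 'Retail Trade',
]

def first_code(text):
    # Keep only the first code of a range such as '31-33'.
    return re.split('[-/]', text)[0]

# Pair even-indexed cells (codes) with odd-indexed cells (descriptions).
sector_names = {first_code(code): name
                for code, name in zip(cells[0::2], cells[1::2])}

def to_sector_name(value):
    # 'NAICS11' -> description for '11'; None for values that are not
    # 'NAICS' followed by digits, or whose code is not in the dictionary.
    match = re.fullmatch(r'NAICS(\d+)', value)
    if match is None:
        return None
    return sector_names.get(match.group(1))

for raw in ['NAICS11', 'NAICS31', 'NAICS44', 'TOTAL']:
    print(raw, '->', to_sector_name(raw))
```

With the real data, the same function could be applied to the sector column through pandas' Series.map, for example df['naics_sector'].map(to_sector_name). The column name naics_sector is an assumption about the CSV header, so check df.columns first.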
References: