Let’s start with how to find the CSV and JSON files we want to read:
    import glob

    # glob.glob returns the paths in the current directory that match a pattern.
    list_csv = glob.glob("*.csv")
    list_json = glob.glob("*.json")
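To see what the pattern matching returns, here is a minimal sketch that runs it against a throwaway directory. The file names (`a.csv`, `b.csv`, `notes.json`, `readme.txt`) are made up for illustration, and the results are sorted because `glob.glob` does not guarantee any particular order.

```python
import glob
import os
import tempfile

# Hypothetical file names, created empty in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("b.csv", "a.csv", "notes.json", "readme.txt"):
        open(os.path.join(tmp_dir, name), "w").close()

    # Sort the matches so the result is reproducible across platforms.
    list_csv = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))
    list_json = sorted(glob.glob(os.path.join(tmp_dir, "*.json")))

    # Keep only the file names so the output does not depend on the temp path.
    csv_names = [os.path.basename(path) for path in list_csv]
    json_names = [os.path.basename(path) for path in list_json]

print(csv_names)   # ['a.csv', 'b.csv']
print(json_names)  # ['notes.json']
```

Files that match neither pattern, such as `readme.txt` here, are simply ignored.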
Let’s create a function to read a CSV file in Python:
    import pandas as pd

    def extract_from_csv(file_to_process):
        # Read one CSV file into a DataFrame.
        dataframe = pd.read_csv(file_to_process)
        return dataframe

    df = extract_from_csv("example1.csv")
Now, let’s create another function to read the content of a JSON file in Python:
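    def extract_from_json(file_to_process):
        # Read one JSON file into a DataFrame (pd.read_json, not pd.read_csv).
        # For JSON Lines files (one object per line), pass lines=True as well.
        dataframe = pd.read_json(file_to_process)
        return dataframe

    df = extract_from_json("example1.json")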
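JSON files come in more than one layout, and the reader has to match the one you have. The sketch below assumes two hypothetical files, `array.json` (a single JSON array of records) and `records.jsonl` (one JSON object per line), and adds an illustrative `lines` parameter to the function so that it can handle both.

```python
import os
import tempfile

import pandas as pd


def extract_from_json(file_to_process, lines=False):
    # `lines` is an illustrative parameter: set it to True for JSON Lines files.
    return pd.read_json(file_to_process, lines=lines)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Layout 1: one JSON array holding every record.
    array_path = os.path.join(tmp_dir, "array.json")
    with open(array_path, "w") as f:
        f.write('[{"col1": 1, "col2": "a"}, {"col1": 2, "col2": "b"}]')

    # Layout 2: JSON Lines, one object per line.
    lines_path = os.path.join(tmp_dir, "records.jsonl")
    with open(lines_path, "w") as f:
        f.write('{"col1": 1, "col2": "a"}\n{"col1": 2, "col2": "b"}')

    df_array = extract_from_json(array_path)
    df_lines = extract_from_json(lines_path, lines=True)

print(df_array)
print(df_lines)
```

Both calls should produce the same two-row DataFrame. Reading a JSON Lines file without `lines=True` raises an error, because the file as a whole is not a single valid JSON document.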
We can combine both functions in a single extract function that builds one dataframe from every CSV and JSON file in the current directory.
    def extract():
        # Collect one DataFrame per file, then concatenate once at the end.
        # DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
        frames = []
        for csv_file in glob.glob("*.csv"):
            frames.append(extract_from_csv(csv_file))
        for json_file in glob.glob("*.json"):
            frames.append(extract_from_json(json_file))

        if not frames:
            # No input files: return an empty frame with the expected columns.
            return pd.DataFrame(columns=["col1", "col2", "col3"])
        return pd.concat(frames, ignore_index=True)
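To check the combined step end to end, here is a self-contained sketch that writes one small CSV file and one small JSON file to a temporary directory and extracts both. The `directory` parameter and the sample values are assumptions added for the demonstration. It uses `pd.concat`, since `DataFrame.append` is no longer available in pandas 2.0 and later.

```python
import glob
import os
import tempfile

import pandas as pd


def extract(directory="."):
    # `directory` is a hypothetical parameter so the demo can use a temp folder.
    frames = []
    for csv_file in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        frames.append(pd.read_csv(csv_file))
    for json_file in sorted(glob.glob(os.path.join(directory, "*.json"))):
        frames.append(pd.read_json(json_file))

    if not frames:
        # Nothing to read: return an empty frame with the expected columns.
        return pd.DataFrame(columns=["col1", "col2", "col3"])
    return pd.concat(frames, ignore_index=True)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Made-up sample data, one row per file.
    pd.DataFrame({"col1": [1], "col2": ["a"], "col3": [0.5]}).to_csv(
        os.path.join(tmp_dir, "example1.csv"), index=False
    )
    pd.DataFrame({"col1": [2], "col2": ["b"], "col3": [1.5]}).to_json(
        os.path.join(tmp_dir, "example1.json"), orient="records"
    )

    extracted_data = extract(tmp_dir)

print(extracted_data)
```

Collecting the frames in a list and concatenating once is also cheaper than growing a DataFrame inside the loop, because each concatenation copies all the rows gathered so far.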