90,405 questions
Advice
0
votes
4
replies
89
views
How do I replace a certain file in a directory with the old one if it gets downloaded again?
I've been trying to figure this out for a while, but I can't seem to find a clear answer.
Is there a way to replace a file that already exists in a directory?
Context
I'm building an API that ...
2
votes
1
answer
99
views
Using csv, how do I write a list of dictionaries that sometimes contain lists to a file?
I need to create a list from a CSV file. Its data is movie information such as title, director, and actors; directors and actors are lists, while the other fields are plain strings. It looks like this:
with open('filmy....
1
vote
1
answer
60
views
Perform VLOOKUP-like operation between two CSV files using Miller
I am using Miller to work with .csv files.
There is one issue, though, that I can't figure out how to get around.
So I have 2 .csv files.
product-feeder.csv
name-manufacturer,id-product,name-product,id-...
-6
votes
0
answers
114
views
How to send a date from a CSV to a MySQL DB (datetime_immutable) in Symfony?
I am trying to send a CSV with this command in Symfony:
protected function execute(InputInterface $input, OutputInterface $output): int
{
    $io = new SymfonyStyle($input, $output);
    $io->...
-3
votes
1
answer
126
views
Is there a way to directly convert CSV to Parquet with DuckDB in Java?
I am running some tests comparing DuckDB usage across different languages, and I've noticed something strange.
In Python you can do the following:
duckdb.read_csv(inputFile, max_line_size=10000000, ...
0
votes
3
answers
123
views
Convert JSON file to CSV using jq, expanding nested array data in multiple columns
I need to convert a JSON file to CSV in a bash script. This is a sample file:
{
  "total": 1,
  "jobs": [
    {
      "deviceData": {
        "deviceId": "...
0
votes
0
answers
77
views
Parsed CSV rows are coming through as Buffers instead of arrays when loading files from S3
I'm pulling multiple CSV files from S3. Each CSV contains several rows in this format:
45,ABC,800046,HJN,9000
The first column is the employee ID.
I want to loop through all the files, parse each CSV, ...
0
votes
0
answers
96
views
Why does Polars run OOM when reading a compressed CSV file while Pandas is able to read it?
I have a CSV file compressed as csv.gz that I want to run some processing on. I generally go with Polars because it is more memory-efficient and faster. Here is the code which I am using ...
0
votes
0
answers
45
views
How to BULK INSERT hex strings into a VARBINARY column in Azure SQL (from CSV) without staging?
I am loading data from Parquet into Azure SQL Database using this pipeline:
Parquet → PyArrow → CSV (Azure Blob) → BULK INSERT
One column in the Parquet file is binary (hashed passwords).
PyArrow CSV ...
-3
votes
0
answers
52
views
CSV data not outputting from list comprehension after record count (for loop) [duplicate]
I've been tasked with manipulating a CSV file in Python. I set up a strip & split command to clean up the data, and it works:
with open("GLB.Ts+dSST_cleaned.csv") as csv:
header ...
Tooling
0
votes
7
replies
136
views
One-liner to get distinct values of all columns of a TSV
I am looking for a one-liner that can be run in a Linux terminal and does the following.
It takes as input a tab-separated file (TSV) with many columns (~100) and creates a two-column TSV output with ...
1
vote
3
answers
214
views
Polars: how to write a column of strings into a txt file without escaping?
I have an .ndjson file with millions of rows. Each row has a field html which contains HTML strings. I would like to write all of this HTML into a .txt file, with one HTML string per line of the .txt file. I ...
Advice
0
votes
3
replies
106
views
CSV file editing via a bat file
I have a report CSV file that has some special characters in the header row. I would like to set up a short script in a .bat file to remove these characters, so I can schedule a task to automatically ...
1
vote
1
answer
101
views
Python, parse nested JSON to make it flat for CSV
I'm trying to store API output in a CSV/db and cannot figure out how to handle the keys in "tierList". In my case, one row should correspond to one bin, and I need the keys as columns in my output.
Is ...
-3
votes
1
answer
102
views
create dataframe from csv in PythonAnywhere [closed]
I am trying to display the headers of a data frame I created based on a csv file using the PythonAnywhere free version. I keep getting a huge error message and I don't understand what I did wrong.
...