289,204 questions
1
vote
2
answers
95
views
pandas last 3 weeks same day average in groups
I have a dataset with a column of groups, dates, day of the week and some data columns. For each date in each group, I want to work out the same day average from the last 3 weeks. I've been scratching ...
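A minimal sketch of one way to approach this, not taken from the question itself. It assumes each group has exactly one row per calendar day with no gaps, and that "last 3 weeks" means the three previous occurrences of the same weekday, excluding the current date. The column names `group`, `date`, `value`, `dow` and `same_day_avg_3w` are hypothetical.

```python
import pandas as pd

# Hypothetical data: one row per group per calendar day, no missing dates.
dates = pd.date_range("2024-01-01", periods=28, freq="D")
df = pd.DataFrame({
    "group": ["a"] * 28,
    "date": dates,
    "value": [float(i) for i in range(28)],
})
df["dow"] = df["date"].dt.dayofweek

# Sort so that rows within each (group, weekday) bucket are in date order.
df = df.sort_values(["group", "date"]).reset_index(drop=True)

# Within each bucket, shift by one row to exclude the current date,
# then average the three previous same-weekday values.
df["same_day_avg_3w"] = (
    df.groupby(["group", "dow"])["value"]
      .transform(lambda s: s.shift(1).rolling(3).mean())
)
```

Rows with fewer than three earlier same-weekday observations come out as NaN. If dates can be missing, a time-based window would be needed instead of counting rows.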
2
votes
3
answers
84
views
Python packages installation - pandas & numpy
I have read close to all posts about this topic now but I can't get it working, so sorry if you find this similar to other questions, but I just can't solve it based on existing posts.
I find it very ...
Advice
0
votes
1
replies
37
views
Pandas segfaults when attempting to read a multi-line cell
I have a set of TSV files that are filled by serializing pandas dataframes built on the responses I get from querying a large database using GraphQL. In the response I get, I have a field that hosts ...
-2
votes
2
answers
89
views
Pandas is saving a column as a float even though I converted to str using .astype [closed]
I am trying to save a string that will be used as a formula in Google Sheets.
import os
import re
from pathlib import Path
import pathlib
import glob
import gspread
from pydrive2.auth import ...
1
vote
1
answer
85
views
Nuitka & PyQt6 specifying imports to reduce exe size (currently ~1gb)
I'm working on a project to develop a simple app, based on a Python script for data analysis, using PyQt6 to format the app, and using Nuitka to create an exe. (Yes, I've tried using PyInstaller. My ...
3
votes
1
answer
73
views
faster methods to remove substrings stored in one column from strings stored in another column
hist_df_2["time"] = hist_df_2.apply(lambda row : hist_df_2['timestamp'].replace(str(hist_df_2['date']), ''), axis=1)
I tried this to remove the date part from the timestamp. However, for ...
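A hedged sketch of a row-wise alternative, assuming both `timestamp` and `date` are string columns and that each date appears verbatim inside its timestamp. The sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real frame in the question is not shown.
hist_df_2 = pd.DataFrame({
    "timestamp": ["2024-01-05 10:15:00", "2024-01-06 23:59:59"],
    "date": ["2024-01-05", "2024-01-06"],
})

# Pair each timestamp with its own row's date and strip that date out.
# A plain list comprehension avoids the per-row overhead of DataFrame.apply.
hist_df_2["time"] = [
    ts.replace(d, "").strip()
    for ts, d in zip(hist_df_2["timestamp"], hist_df_2["date"])
]
```

The key difference from the attempt above is that each row uses its own `timestamp` and `date` values rather than the whole columns.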
1
vote
1
answer
126
views
Broadcasting DataFrames across NumPy array dimensions
I'm working with a large Pandas DataFrame and a multi-dimensional NumPy array. My goal is to efficiently "broadcast" a specific column of the DataFrame across one or more dimensions of the ...
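A small sketch of the general idea, under the assumption that the DataFrame has as many rows as the array's first axis. The names `scale`, `arr` and `scaled` are hypothetical, and multiplication stands in for whatever operation is actually wanted.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: 3 rows in the frame, array of shape (3, 2, 4).
df = pd.DataFrame({"scale": [1.0, 2.0, 3.0]})
arr = np.ones((3, 2, 4))

# Pull the column out as a 1-D NumPy array, then add two trailing
# length-1 axes so it broadcasts along the array's last two dimensions.
col = df["scale"].to_numpy()
scaled = col[:, None, None] * arr
```

Broadcasting here does not materialize a repeated copy of the column; NumPy stretches the length-1 axes during the operation.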
-1
votes
0
answers
82
views
Best approaches to applying a function to more than one column in df at once? [duplicate]
Say I have a pandas dataframe with > 2 columns and > 2 rows. I want to apply a function, such as a datatype conversion, to each element in at least two columns. I would like for it to be efficient,...
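For the datatype-conversion case specifically, one possible sketch is to pass a column-to-dtype mapping to `DataFrame.astype`, which converts whole columns at once instead of element by element. The frame and column names below are made up.

```python
import pandas as pd

# Hypothetical frame: two numeric-looking string columns and one text column.
df = pd.DataFrame({
    "a": ["1", "2", "3"],
    "b": ["4", "5", "6"],
    "c": ["x", "y", "z"],
})

# Convert only the selected columns; columns not in the mapping are untouched.
df = df.astype({"a": "int64", "b": "int64"})
```

For an arbitrary element-wise function there may be no vectorized equivalent, in which case a per-column `Series.map` is a common fallback.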
5
votes
3
answers
247
views
How to query columns that are lists or dicts?
How can I query columns that are lists or dicts? Here is some basic JSON-like data.
[
{
"id": 1,
"name": "John Doe",
"age": 30,
...
-1
votes
0
answers
41
views
Replacing dataframe column with a subset of a value from another column [duplicate]
I'm trying to replace the value in a dataframe column with the partial value from another dataframe column and not having any luck.
I have this:
import pandas as pd
df = pd.DataFrame({'Action': ['set'...
1
vote
3
answers
114
views
Access data frame from binary file
If I have saved a data frame using pickle in a binary file, how can I access it?
def create_dataset(path):
"""
creates a binary file with the dataset saved in it.
"...
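A minimal round-trip sketch, assuming the frame was written with pandas' own pickle support. The file name `dataset.pkl` and the sample frame are hypothetical, and a temporary directory is used so the example leaves nothing behind.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame standing in for the saved dataset.
df = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.pkl")

    # Write the frame to a binary pickle file.
    df.to_pickle(path)

    # Read it back; the result is a regular DataFrame again.
    restored = pd.read_pickle(path)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.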
-3
votes
2
answers
111
views
How to print the value counts of a user-selected column in a pandas DataFrame? [closed]
I’m trying to write a Python script that allows the user to input the name of a column and then prints the value counts of that column from a pandas DataFrame. Here's what I currently have:
def ...
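One possible shape for this, sketched under the assumption that the column name arrives as a plain string (for example from `input()`). The function name `column_value_counts` and the sample data are hypothetical.

```python
import pandas as pd


def column_value_counts(df, column):
    """Return the value counts of `column`, or raise if it does not exist."""
    if column not in df.columns:
        raise KeyError(f"No column named {column!r}; available: {list(df.columns)}")
    return df[column].value_counts()


# Hypothetical sample data.
df = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "size": ["S", "M", "M"],
})

# In an interactive script the name could come from input("Column: ").
counts = column_value_counts(df, "color")
print(counts)
```

Checking the name up front gives the user a readable message instead of a bare `KeyError` from the indexing step.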
0
votes
1
answer
56
views
When big data : requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
The problem appears when I have a large number of values in my pandas df. When I take 27 or 54 values (1 or 2 columns) it works normally, but when I take more columns it gives me this error (I import gs ...
3
votes
2
answers
110
views
pd.Timestamp has attribute .isoformat, but the series accessor .dt does not
I am trying to convert a column of pd.Timestamp objects into a column of type str where the dates are encoded in ISO format.
The class pd.Timestamp has the handy method .isoformat() that does ...
1
vote
1
answer
107
views
Bizarre "kernel has died" error in pandas df.to_excel() caused by geopandas
When pandas.read_excel(), df.to_excel(), geopandas.read_file() and gdf.to_file() are called in a certain order in different environments, pd.read_excel() sometimes causes "Kernel has died" ...