4,838 questions
Best practices
0
votes
1
replies
37
views
When should data go to Archive vs Reject in Bronze layer (Medallion Architecture)?
Can anybody help me understand the Archive and Reject folders in the Bronze layer of a Medallion Architecture? Let's say I have 4 folders in Bronze, namely Raw, Stage, Archive, and Reject. To what extent a ...
Advice
1
vote
3
replies
65
views
How to handle heteroskedasticity when detecting anomalies using Z-scores on growth rates?
I’m trying to detect anomalies in a dataset using Z-scores based on the logged index change of a value between two time periods: VAL_t0 and VAL_t1.
The issue:
The variance of Z-scores decreases as ...
Advice
0
votes
1
replies
43
views
Analyze a directory in a performant (cross-platform) way for what file types (file extensions) it (recursively) contains?
Aim
My aim is to analyze a (big) (sub)directory and find out which file extensions its files have (recursively).
Additionally, these conditions apply:
I am on Windows, but I could use WSL ...
2
votes
1
answer
83
views
Multiple variable/correlation analysis using Python
I have a machine with up to 58 independent input variables and one response variable. If I create 2D-histogram plots of all combinations of 2 input variables and the response variable, I get plots ...
0
votes
0
answers
29
views
Unable to fetch Accurate Performance Max (PMAX) YouTube Video Metrics via Google Ads Script / BigQuery Transfer
I’m currently working on a task to fetch and display daily Google ADS Manager (GAM) records—such as Cost, ROAS, and other metrics—within a data analysis application. I’ve successfully retrieved data ...
-4
votes
2
answers
152
views
How to find the correlation between the two most commonly sold items
I have data of items sold in the supermarket. I want to summarise:
which items are most commonly sold together?
what is the ...
2
votes
1
answer
140
views
FFT-based quasi-steady detection issue
I am trying to detect the beginning of a quasi-steady regime in time series data representing drag (Fx) and lift (Fy) forces after an initial transient.
Initially, I used a slope-based method, but it ...
1
vote
1
answer
39
views
How to prefix a specific series of lines (multiple)?
I have a text file with an inconsistent timestamp format that I would like to standardize. It is an interview transcript, ultimately intended for textual analysis.
What command could I run ...
0
votes
0
answers
36
views
Creating a Line graph & Matrix showing % Difference from Previous Year, allowing for a filter on "Country"
I have raw data with columns "Year", "Period", "Country", "Sales Amount". Please note there are 13 periods in a year (not 12). The dates also differ ...
0
votes
0
answers
160
views
Why did my Transformer model not work well on single-cell multi-omic data?
The complete code and data are available at: Google Disk
I'm working on a high-dimensional regression problem and have built a Transformer-based model in PyTorch. While the model trains, I'm observing ...
2
votes
1
answer
81
views
How can I summarize sales data by category in a new column?
I would like to create a new column where I can sum units by quarter.
My table looks like this:
In another column, I would like to sum units by product and quarter.
Edit: The last two columns show the expected result.
Unit
...
0
votes
1
answer
85
views
What is the meaning of Data variables: *empty* when working with a .nc file?
I had the task of analyzing a .nc file. I have not worked with this format before, so I did some research online and found some code.
I was exploring the dataset. I tried:
I am using Debian 12, Python3 and ...
1
vote
1
answer
184
views
Card (New) Visual – Show Scaled Numeric Value (K/M/B) with Conditional Arrow Indicator
I'm visualizing totals using the Card (New) visual in Power BI, and I’d like to add an arrow:
Up arrow when the value is positive
Down arrow when the value is negative
I can achieve this by creating ...
2
votes
1
answer
75
views
Error when running lilikoi.featuresSelection() in lilikoi R package: "non-numeric argument to binary operator"
I'm using the lilikoi R package to follow a built-in example from the official documentation. While most of the steps work correctly, I encounter an error when I attempt to run the lilikoi....
1
vote
0
answers
75
views
Power BI DAX Calculate not working as expected in Matrix report
I'm trying to create a matrix report where the row headers will be the recruitment product of users and the column headers will be their subsequent products. The measure needs to count the distinct count ...