Questions tagged [active-learning]
Active learning is a setting where an automated learning system can request labels from an external source, such as a human user or a real-world experiment. The goal is to learn a good model while minimizing the number of label requests.
53 questions
0 votes, 0 answers, 51 views
Outlier detection in many short time series
I have a dataset with ~20,000 entries containing mean values for different groups. The groups are defined by 4 categorical columns and I have the week number, the number of samples per week and the ...
2 votes, 0 answers, 54 views
How best to approach ML classification active learning loop with ratio data?
I am currently working on a project investigating structure-property relationships between molecules and their phase behaviour (which is binary in this case). The molecules have a backbone, any number ...
8 votes, 1 answer, 335 views
Expected Error Reduction for regression problems
I would like to implement in Python an active learning approach based on Expected Error Reduction (EER):
https://axon.cs.byu.edu/Dan/778/papers/Active%20Learning/roy.pdf
https://arxiv.org/abs/2211....
5 votes, 1 answer, 93 views
Active learning for simulation parameters
Suppose we are training a neural network (or more generally, your favorite nonparametric model) $f: \mathbb{R}^k \to \mathbb{R}$ to solve a regression problem. For clarity, $k$ is of the order $\sim10$...
2 votes, 1 answer, 145 views
Best interpolating points for a Gaussian process regression
I have an unknown function $f(x)$, defined on a domain, that is modeling a perception function based on a human user response. I estimate it with a GP with mean $\mu$ and kernel $K$. I determine the ...
0 votes, 0 answers, 97 views
Best function value for Expected Improvement computation for Gaussian Process Regression
I want to implement Gaussian Process regression in the context of active learning, in which interpolation is performed with the best interpolating points, selected at each step iteratively. At every ...
2 votes, 0 answers, 53 views
Active learning for Computer Vision where I have to generate my own images
I'm interested in applying active learning to a computer vision bounding box prediction project, where I don't have a large corpus of unlabeled images available, and instead have to take all pictures ...
3 votes, 1 answer, 111 views
Statistical test to determine if Active Learning has provided a significant improvement?
I am conducting various active learning experiments on two Biomedical Relation Extraction corpora:
2018 n2c2 challenge: 41000 test samples
DDI Extraction corpus: 5700 test samples
and using four ...
1 vote, 0 answers, 52 views
Can models improve the more you use them?
I came across this interesting question on Quora (someone else asked): How can I penalize a deployed classifier for every wrong prediction in the production run? I want to implement an online model ...
4 votes, 3 answers, 520 views
Techniques for strategically crafting a ML dataset
For a supervised machine learning application where the input features can be readily calculated and the corresponding labels are the result of a somewhat time-consuming simulation using the inputs, ...
0 votes, 1 answer, 96 views
Name and origin of a metric for evaluating ranking performance
I came across an evaluation metric to test whether a predicted rank is good, especially the top k items. But I don't know what it is called or where it is used, which makes my discussion of this ...
0 votes, 1 answer, 82 views
How to adaptively select stimuli in paired comparison experiment?
I have 500 items and want to know how much a specific expert likes each of them.
To find how much they are liked, I wish to use paired comparisons since this yields more accurate results than ...
2 votes, 0 answers, 123 views
How to interpret the relationship between batch size and bootstrap count in a specific paper?
In the paper "Active Learning for Natural Language Parsing and Information Extraction", the author states:
In tests on this data, test examples were chosen independently for 10 trials ...
1 vote, 0 answers, 33 views
Reasoning about increased performance from two disjoint datasets compared to their union
Say I have some pretrained model with an ok but not great accuracy. I can augment my existing training data with new datasets and retrain to increase performance. Say there are two datasets available, ...
2 votes, 0 answers, 52 views
Which item to administer to improve the standard error of the item difficulty estimate the most?
I have a student with a known ability level and a set of items with an estimated difficulty. The probability of a student getting an item correct follows a one-parameter logistic model (also known as ...