- In this part we consider models where Y is measured with error, so we observe Y* (written $Y^{\text{obs}}$ in the formulas below) instead of Y, while all X are measured without error.
- A useful reference is Adhya, S., Roy, S., & Banerjee, T. (2022). Prediction of Finite Population Proportion When Responses are Misclassified. Journal of Survey Statistics and Methodology, 10(5), 1319-1345.
- In this case we assume the following model:
$$
\tilde{\pi}\left(\mathbf{x}_i ; \beta, \epsilon\right)=P\left(Y_i^{\text{obs}}=1 \mid \mathbf{x}_i\right)=\epsilon_0+\left(1-\epsilon_0-\epsilon_1\right) \pi\left(\mathbf{x}_i ; \beta\right),
$$
where
$$
\begin{aligned}
& P\left(Y_i^{\text{obs}}=1 \mid Y_i=0\right)=\epsilon_0, \\
& P\left(Y_i^{\text{obs}}=0 \mid Y_i=1\right)=\epsilon_1.
\end{aligned}
$$
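The model above follows from the law of total probability, $P(Y_i^{\text{obs}}=1 \mid \mathbf{x}_i) = (1-\epsilon_1)\,\pi + \epsilon_0\,(1-\pi)$, provided the error rates do not depend on $\mathbf{x}_i$. A minimal simulation sketch can be used to check this numerically. Everything specific in it is an assumption made for illustration, not part of the notes: the logistic form of $\pi(\mathbf{x}_i;\beta)$, the coefficient values, the error rates, and the helper names `observed_prob` and `misclassify`. The last line inverts the model for the mean, which is only possible when $\epsilon_0$ and $\epsilon_1$ are known and $\epsilon_0+\epsilon_1 \neq 1$.

```python
import numpy as np


def observed_prob(pi, eps0, eps1):
    """P(Y_obs = 1 | x) = eps0 + (1 - eps0 - eps1) * pi."""
    return eps0 + (1.0 - eps0 - eps1) * pi


def misclassify(y, eps0, eps1, rng):
    """Flip true binary labels: 0 -> 1 with prob. eps0, 1 -> 0 with prob. eps1."""
    u = rng.random(y.shape)
    flip = np.where(y == 1, u < eps1, u < eps0)
    return np.where(flip, 1 - y, y)


rng = np.random.default_rng(2022)
n = 200_000
beta = np.array([-0.5, 1.0])   # hypothetical coefficients
eps0, eps1 = 0.05, 0.10        # hypothetical, assumed-known error rates

x = rng.normal(size=n)
pi = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))  # assumed logistic pi(x; beta)
y = rng.binomial(1, pi)                               # true (unobserved) Y
y_obs = misclassify(y, eps0, eps1, rng)               # observed, misclassified Y

naive_mean = y_obs.mean()                             # mean of observed labels
model_mean = observed_prob(pi, eps0, eps1).mean()     # mean implied by the model
corrected_mean = (naive_mean - eps0) / (1.0 - eps0 - eps1)  # model inverted for the mean

print(f"true mean      : {y.mean():.4f}")
print(f"naive mean     : {naive_mean:.4f}")
print(f"model-implied  : {model_mean:.4f}")
print(f"corrected mean : {corrected_mean:.4f}")
```

In this sketch the naive mean should agree with the model-implied mean up to simulation noise, while the corrected mean should be close to the true mean. In practice the error rates are rarely known, so the correction step is shown only to illustrate how the model links the two quantities.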
The research questions are therefore:
- how to deal with this issue in the case of non-probability samples?
- how can we deal with it in the mass imputation estimator?
- how can we deal with it in doubly robust estimators?
In particular, what is needed here:
- how to deal with misclassification in a binary Y variable? - literature review needed
- how to deal with misclassification in a multinomial Y variable? - literature review needed
- what is the bias of the mean estimated using the mass imputation estimator?
- what are the conditions and assumptions for data integration with a misclassified Y variable?
- TBA
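
As a possible starting point for the bias question, the misclassification model can be averaged over the units. This is only a sketch: it assumes that $\epsilon_0$ and $\epsilon_1$ do not depend on $\mathbf{x}_i$, and the symbols $N$, $\mu$ and $\mu^{\text{obs}}$ are notation introduced here. It describes the mean of the observed-response probabilities, not the mass imputation estimator itself.

$$
\begin{aligned}
\mu^{\text{obs}} &= \frac{1}{N} \sum_{i=1}^{N} \tilde{\pi}\left(\mathbf{x}_i ; \beta, \epsilon\right) = \epsilon_0+\left(1-\epsilon_0-\epsilon_1\right) \mu, \qquad \mu=\frac{1}{N} \sum_{i=1}^{N} \pi\left(\mathbf{x}_i ; \beta\right), \\
\mu^{\text{obs}}-\mu &= \epsilon_0-\left(\epsilon_0+\epsilon_1\right) \mu .
\end{aligned}
$$

Under these assumptions the difference vanishes only when $\epsilon_0(1-\mu)=\epsilon_1 \mu$, that is, when false positives and false negatives cancel on average.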