methods when all variables are measured with error

In this case we assume that we observe $(y_i^*, x_i^*, z_i)$ in the non-probability sample and $(x_i, z_i)$ in the probability sample (or population), where $^*$ indicates that a given variable is measured with error (misclassified).

A motivating example is as follows:

+ *target variable*: whether English is required. This may be stated explicitly in a given ad, but for some ads it is missing and has to be derived from the text (say the ad is written in English, or the text states that English is "our language")
+ *auxiliary variables* ($X$): information about the occupation is missing, so we derive it using our classifier
+ *auxiliary variables* ($Z$): information about a given company, measured without error (say size, NACE, public/private)

Research questions:

+ how can we deal with such cases? what does the literature say about this?
+ what is the bias when we estimate a regression model for $E(y_i^* | x_i^*, z_i)$ instead of $E(y_i | x_i, z_i)$?
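
To get a first feel for the second question, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: binary $y$, $x$ and $z$, independent (non-differential) misclassification with flip probabilities `flip_x` and `flip_y`, made-up coefficients, and a linear probability model fitted by least squares. The function name `simulate_bias` is hypothetical. It compares the coefficient on $x$ from the model fitted on $(y_i, x_i, z_i)$ with the one fitted on $(y_i^*, x_i^*, z_i)$.

```python
import numpy as np


def simulate_bias(n=200_000, flip_x=0.2, flip_y=0.1, seed=0):
    """Fit a linear probability model on true (y, x, z) and on
    misclassified (y*, x*, z); return both coefficient vectors."""
    rng = np.random.default_rng(seed)

    def expit(t):
        return 1.0 / (1.0 + np.exp(-t))

    # z: company covariate measured without error (e.g. public/private)
    z = rng.binomial(1, 0.5, n)
    # x: true occupation indicator, correlated with z (assumed coefficients)
    x = rng.binomial(1, expit(-0.5 + z))
    # y: true "English required" indicator (assumed coefficients)
    y = rng.binomial(1, expit(-1.0 + 1.5 * x + 0.5 * z))

    # Assumed error mechanism: each label is flipped independently
    x_star = np.where(rng.random(n) < flip_x, 1 - x, x)
    y_star = np.where(rng.random(n) < flip_y, 1 - y, y)

    def ols(response, predictor):
        # Columns: intercept, x (or x*), z
        design = np.column_stack([np.ones(n), predictor, z]).astype(float)
        coef, *_ = np.linalg.lstsq(design, response.astype(float), rcond=None)
        return coef

    return ols(y, x), ols(y_star, x_star)


beta_true, beta_star = simulate_bias()
print("coefficient on x, true data:    ", round(beta_true[1], 3))
print("coefficient on x*, observed data:", round(beta_star[1], 3))
```

Under these assumptions the coefficient on $x^*$ is pulled towards zero relative to the coefficient on $x$. Differential errors, or errors that depend on $z$, could behave differently, so this is only a starting point for the question and not an answer to it.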
