A lot of question for the ConQuest program

#### Morten Puck

January 19, 2015, 10:43:39 AM
Hi ConQuest helpdesk,

I work at the Department of Education at Aarhus University, and we are currently running a large project whose goal is to test students in 3rd to 8th grade. We have conducted the tests and they have been scored, either automatically or manually. All in all we have 237 items and 2395 students. Of course, not every student answered every item; instead, each student completed 2 booklets of around 40 items each. In total we have 6 different booklets.
Our main goal is to find the plausible values (PVs) for each dimension in our test, but we are having some trouble with ConQuest.

1) How can I build a dataset that is complete and allows fast calculations?
At the moment I have constructed a .prn file, which contains all the information about each student and their scores on each item. I do not think the .prn format is the best solution, but it works for me even though the calculations are slow (approximately 20 minutes). Do you have a better solution?

2) Our dataset contains a lot of codes: the scores (0,1,2,3,4,5,6) and some codes that mark missing responses, computer errors, and items that were not assigned. We use 7 as the indicator for computer errors, 8 for items that were not assigned, and 9 for items that were not reached or not answered. My question has two parts.
2.1) How should I treat 8? Should it be treated as missing (M), as not reached (R), as a dot (.), or, as a last option, left unrecognised, so that I just write a code statement with (0,1,2,3,4,5,6,7,9)?
2.2) The items are a mixture of dichotomous and categorical items, where a number changes meaning depending on which type of item we are looking at. A 1 on a dichotomous item stands for a correct answer, but on a categorical item it stands for a partly correct answer. Is the solution to this challenge to write a key statement?

3) As I mentioned at the top, we have 237 items in our tests. When I try to include all items, I get a warning that this version of ConQuest is limited to 100 items. This is a problem, because the booklets vary in difficulty: different booklets target different grades. If I just took the first 100 items, I would not have any 8th graders responding to any of the items, because those booklets target 3rd to 6th graders. I am using ConQuest 3 (GUI mode). Is there any way to avoid the 100-item limit?

4) I want to fit a model that tests the structure of the data. Right now my model statement is: model item + step; Should I instead use: model item + item*step + step; ? And a bonus question: to test for DIF, should I include this in the model statement, so that it becomes: model item + item*step + step + item*gender + step*gender + gender; ?

5) One last question for now: when I want to export the PVs, what is the smartest way to do so? If I use the tables to write PVs, I get a text file with 5 PVs for each dimension per student. But how can I link the PVs to the students in the text file? Or should I use a command such as: show cases ! filetype=excel, estimates=latent >> latent.xls; ?

I hope my questions are clear and understandable, and that you can help me so that I can continue working on this large project.

Sincerely
Morten Puck
Research Assistant, Aarhus University

#### Eveline Gebhardt

Re: A lot of question for the ConQuest program
Reply #1 on: January 21, 2015, 10:24:00 PM
Hi Morten

All good questions and perfectly clear. Here are my answers.

(1) The type of file does not influence the estimation time. If you use an SPSS data file, ConQuest takes a little longer to read the data in, but after that the speed is the same as with a text file.

(2.1) Your last solution is correct: codes that are not included in the codes statement are ignored by ConQuest.
(2.2) If the data are already scored, you do not need a key statement. CQ will treat a score of 1 as full credit in a dichotomous item and as partial credit in a PC item.

(3) You will need to apply for the professional licence. If you have purchased the standard licence, you do not need to pay any extra. What you need to do is send your standard licence key to Ray Adams (adams@acer.edu.au) and briefly explain what you will use ConQuest for. Then he will send you the professional licence. It is valid for 1 year. After that it becomes a standard licence again, or you can apply for a new professional licence.

(4) Your model would be item + item*step. If you do a DIF model, you need item + item*step - gender + item*gender (we usually use -gender so that the estimates reflect the ability of each group and higher values reflect higher abilities).

(5) If you include PID in the datafile statement (see manual), then ConQuest will add the ID variable to the file with plausible values. I usually export PVs into an SPSS file, but I think Excel may work as well.

Good luck with the project and let us know if you have any more questions.

Eveline

#### Morten Puck

Re: A lot of question for the ConQuest program
Reply #2 on: January 22, 2015, 09:05:22 AM
Dear Eveline.

Thank you so much for your answers so far. I do have some comments and a few more questions, if you do not mind.

1) I can see your point about the estimation time. I think the differences I observed between file types arose because the datasets themselves were different. So here is a small warning for anybody else reading this: files of different types can differ in the number of observations and the number of items they contain. We solved the dataset issue by starting in Excel, with each item in its own column, so that we can delete any item that should not be in the dataset. When the dataset is finished, we save the file as a .csv. We then open the .csv file in Word, where we remove all the semicolons. Finally we save the Word file as a text file that we can use in ConQuest. Can we use a .csv file directly in ConQuest? And can we use other types of files, let us say files from SPSS or SAS?

2) Ah, I see. The only reason I used the key statement is that I got a warning every time I used the command: model item + step. But I do not get any warning when I use the command: model item + item*step. So I will just drop the key statement when the items are already scored.

3) I will contact your colleague about the professional licence. That will make working with the many items much easier. Just an additional question: why do we not get the professional licence when we purchase the licence for ConQuest? Perhaps there is a reason that I do not know.

4) Can I do multiple DIF tests at the same time? I want to test my items for DIF by gender, age, and immigration status (possibly more, if I construct or have the variables for it). What should my model statement be? model item + item*step + gender + gender*item + age + age*item + immigration + immigration*item;? Or should it be: model item + item*step + gender + gender*item + age + age*item + immigration + immigration*item + gender*age + gender*immigration + age*immigration + gender*age*immigration + gender*age*immigration*item?

5) I will try to include the PID. Just another question about the PVs. Does ConQuest find all the dimensions in the data by itself and give me a set of PVs for each dimension? Or do I have to determine the number of dimensions myself and tell ConQuest that I want that many dimensions and that many sets of PVs? Right now I only get 1 set of PVs, but we expect at least three different dimensions, so I should get at least 3 sets of PVs. I know that I currently have a limited set of items compared to all of my 240 items, but each booklet should cover all the dimensions, and therefore I should get at least 3 sets of PVs.

But yet again thank you so much for your time.

Sincerely
Morten Puck
Research assistant, Aarhus University.
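
Editorial sketch for point 1 above: the Excel, then .csv, then Word, then text route could probably be replaced by a short script. What follows is a minimal Python sketch, not a ConQuest feature. It assumes a semicolon-delimited .csv with a header row, a student ID in the first column, and a single character in every other cell. The function name `csv_to_fixed_width`, the ID width of 6, and the example data are all hypothetical.

```python
import csv
import io

def csv_to_fixed_width(csv_text, id_width=6, delimiter=";"):
    """Convert semicolon-delimited rows into fixed-width lines.

    Assumes (hypothetically) that the first row is a header, the first
    column is a student ID, and every other cell holds one character.
    Empty cells are written as a blank so column positions stay aligned.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    next(reader, None)  # skip the header row
    lines = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        student_id = row[0].strip().rjust(id_width)
        cells = []
        for cell in row[1:]:
            value = cell.strip()
            if len(value) > 1:
                raise ValueError(f"multi-character cell {value!r} for ID {row[0]!r}")
            cells.append(value or " ")
        lines.append(student_id + "".join(cells))
    return "\n".join(lines)

# Made-up example data: ID, age, then three item responses.
example = "id;age;item1;item2;item3\n101;3;1;0;8\n102;4;;2;9\n"
print(csv_to_fixed_width(example))
```

Each output line has the ID right-aligned in the first six columns and one column per remaining field, which is the kind of layout a format statement with column ranges describes. Raising an error on multi-character cells is a deliberate choice: a silently shifted column would misalign every later item.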

#### Eveline Gebhardt

Re: A lot of question for the ConQuest program
Reply #3 on: January 27, 2015, 02:11:36 AM
Dear Morten

(1) ConQuest reads in text and SPSS data files (please have a look in the command reference guide that is provided with the software).

(4) I'd keep it simple and estimate each DIF model separately unless you have a very good reason to combine two variables. It will be very difficult to interpret all those interactions you include below.

(5) You need to assign items to dimensions with the SCORE statement. IRT is a form of confirmatory factor analysis, so you need to define the model. You can export PVs directly into SPSS. It's probably easiest if you look up the command in the reference guide and then open the SPSS export file with PVs to have a look. Please let me know if that raises any questions.

Eveline

#### Morten Puck

Re: A lot of question for the ConQuest program
Reply #4 on: January 27, 2015, 08:31:19 AM
Dear Eveline.

Again, thank you so much for your help. But some of the answers raise more questions.

1) Okay, I think I will stick with a text file, because then I can do any editing of the data file in Excel first.

2) When I try to drop the key statement, I still get a warning even though the items are scored. So I need the key statement to do any estimation. It is not a big problem, but is it a common one?

4) So you would run many model statements: model item + item*step; model item + item*step + age + age*item; model item + item*step + gender + gender*item; etc. One question: can you have multiple model statements in one input window?
4.1) When I use a model statement like: model item + item*step + age + age*item; I get a warning saying that ConQuest is limited to 1000 items. I do not know how ConQuest gets the idea that there are over 1000 items. When I just use the model statement: model item + item*step; there is no problem with the estimation.

5) Regarding the plausible values, I use the command: show cases !filetype=excel, estimate=latent >> C:\name of file.xls PID; Should that generate a file with the PVs as soon as the estimation is done?

6) A new question. The estimation time right now is over 4 hours. Is ConQuest doing pairwise estimation because I have missing data? With 243 items that would make 243*242=58806 estimations, and with 2395 observations on top of that, I suspect that is why the estimation takes so long. A confirmation of my suspicion would be sufficient.

Thank you again for your time.

Sincerely
Morten Puck
Research Assistant, Aarhus University

#### Eveline Gebhardt

Re: A lot of question for the ConQuest program
Reply #5 on: January 27, 2015, 11:27:45 PM
Hi Morten

(2) Sorry, I can't tell without knowing what the problem is.

(4) Exactly. No, you can only run one model at a time.

(4.1) I am guessing your age variable is continuous? You need a categorical variable here.

(5) Yes, just after the estimations are finished, ConQuest will estimate posterior distributions and draw plausible values.

(6) It doesn't do pairwise estimation. 4 hours is quite normal when you include a regression model for drawing plausible values and multiple dimensions. (Some extreme models we have run took up to 2 weeks!)

Cheers
Eveline

#### Morten Puck

Re: A lot of question for the ConQuest program
Reply #6 on: January 28, 2015, 07:57:58 AM
Dear Eveline.

Thank you so much for your time.

2) That is okay. My coding is working, and sometimes you do not need to know why.

4) Okay, that was also my suspicion, that it can only handle one model at a time.
4.1) My variable should be categorical. Age is coded as 1,2,3,4,5,6,7. I do not know if that is too many categories. Could this have an effect on the estimation?

5) Perfect that it writes an Excel file with the plausible values. I just have one minor problem here. I have the PID statement at the end, and I get a column in the Excel file that says PID, but the IDs are not in the column. Am I missing something?

6) Holy moly, 2 weeks of estimation time. That could be a long vacation. But okay, now I am less worried about the long estimation time.

Sincerely
Morten Puck
Aarhus University

#### Eveline Gebhardt

Re: A lot of question for the ConQuest program
Reply #7 on: January 28, 2015, 09:08:21 AM
4.1) In that case it is OK to use the age variable. You do indeed have more than 1000 generalised items (243 items plus step parameters plus 7 age groups plus 7*243 interactions). (I did not know there was a maximum of 1000 items. I'll check with Ray.) We usually run those models in two steps. First we calibrate the items in each dimension (so that is 3 models if you have 3 dimensions). Part of this process is to check the items for DIF. After you have selected your final set of items for each dimension, you run a conditional model where you anchor the item parameters to the estimated values from step 1 and you include a regression model to draw plausible values. If you have 3 or more dimensions you may like to choose the montecarlo estimation method, which is quicker (but less precise) than the default method. (You should use the default quadrature estimation method when you estimate item parameters in one-dimensional models.) If the standard errors on the estimates are not so important for you, you can estimate quick SEs (stderr=quick) in the estimate command instead of full ones.

5) PID needs to be defined in the datafile statement.

6) Especially holy moly when you're told to change the model several times!

As an extra tip, you could first run your model using only 5 iterations (iter=5 in the estimate command). This way you can see what is included in the output files without waiting for the model to converge.

Cheers
Eveline
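
Editorial sketch for point 4.1 above: the tally of generalised items can be reproduced with simple arithmetic. This is a minimal Python sketch that only mirrors the count described in the reply; it is not how ConQuest computes anything internally. The function name `generalised_item_count` is hypothetical, and the number of step parameters is left as an input because the thread does not state it.

```python
def generalised_item_count(n_items, n_groups, n_step_params=0):
    """Count model terms for: item + item*step + group + group*item.

    Follows the tally given in the reply above: items, plus step
    parameters, plus group main effects, plus one interaction term
    per item-by-group combination.
    """
    interactions = n_groups * n_items
    return n_items + n_step_params + n_groups + interactions

# Figures from the thread: 243 items and 7 age groups.
without_steps = generalised_item_count(243, 7)
print(without_steps)         # 1951, before any step parameters are added
print(without_steps > 1000)  # True
```

Even with zero step parameters the count is 243 + 7 + 1701 = 1951, so the interaction term alone pushes the model past 1000.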

#### Morten Puck

Re: A lot of question for the ConQuest program
Reply #8 on: January 29, 2015, 08:15:41 AM
Dear Eveline.

Regarding 4.1) Would you fit the model with only one model statement and one score statement to specify the dimensions? Or would you fit a separate model for each dimension?
I think I would rather accept a longer estimation time if it gives me more precise estimates and standard errors. So I think I will stick to the default method.
Can you give me an update as soon as you have heard from Ray? It would be nice to do the estimation in one go, rather than splitting the dataset and the estimation.

5) I think I did the PID command right this time, but I will check as soon as the latest regression is finished.

6) I think it is a common problem to change the model several times.

Sincerely
Morten Puck
Research Assistant, Aarhus University

#### Eveline Gebhardt

Re: A lot of question for the ConQuest program
Reply #9 on: January 30, 2015, 12:39:20 AM
The maximum number of generalised items per person (!) is 10,000, so I think you have a mistake somewhere. Have you defined the codes for missing by design correctly? That is, did you exclude the code for items that were not included in a student's booklet from the CODES statement? How many items were included in each booklet?

Eveline

#### Morten Puck

Re: A lot of question for the ConQuest program
Reply #10 on: January 30, 2015, 08:36:40 AM
Dear Eveline.

The probability that I have made a mistake is high; p is close to 1.

Our dataset contains a lot of data. The scored items have values from 0 to 6. Missing responses caused by computer errors are coded 7. Booklets/items that were not assigned are coded 8. And not reached is coded 9.
My code and recode statements are:
code (0,1,2,3,4,5,6,7,9);
recode (0,1,2,3,4,5,6,7,9) (0,1,2,3,4,5,6,M,R);

So I do not include the code for the booklets that were not assigned. I am not certain that I am handling the unassigned booklets/items correctly. What do you think about these statements?

Sincerely
Morten Puck
Research Assistant, Aarhus University.
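
Editorial sketch of the coding scheme in this post: a rough Python analogue of what the two statements are described as doing in this thread, namely that codes left out of the codes statement are ignored, while 7 and 9 are recoded to M and R. This is an illustration of the intended logic only, not ConQuest's implementation. The names `VALID_CODES`, `RECODE`, and `treat_response` are hypothetical.

```python
VALID_CODES = set("012345679")  # code 8 (not assigned) is deliberately left out
RECODE = {"7": "M", "9": "R"}   # computer error -> M, not reached -> R

def treat_response(raw):
    """Return the recoded value, or None when the code is not listed.

    None stands in for 'ignored', which is how the thread describes
    codes that are absent from the codes statement.
    """
    if raw not in VALID_CODES:
        return None
    return RECODE.get(raw, raw)

# Made-up response string covering each kind of code.
responses = "3178902"
print([treat_response(r) for r in responses])
```

Under these assumptions the 8 is the only value that drops out entirely, which matches the idea of treating unassigned items as missing by design.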

#### Morten Puck

Re: A lot of question for the ConQuest program
Reply #11 on: March 10, 2015, 07:54:10 AM
I have a very long message, so I will try to cut it in half and see if I can post it. I have problems when I post it all at once.
Dear Eveline.

I need your much appreciated help again.
I want to fit a multidimensional model, and I have assigned each item to one of the five dimensions. This has been done using a theoretical framework from a factor analysis.
My coding is:
datafile \\s-succesix1\win7_profiles\$\morrp\Desktop\Demonskoleprojektet elevscorer\granitblok.txt !PID;
format id 1-6 gennem 7 age 8 gender 9 indvandre 10 responses 11-253;
labels << \\s-succesix1\win7_profiles\$\morrp\Desktop\Demonskoleprojektet elevscorer\navne.dat;
code 0,1,2,3,4,5,6,7,9;
recode (0,1,2,3,4,5,6,7,9) (0,1,2,3,4,5,6,M,R);
key 111111111111X111111111111111111111111XXXXXXXXXXXX1111111111XX11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111X11XXXXXXX111111111111111X1111111111111111111!1;
key 222XXX2X2XXXXXXXXXXXXXX22X22222222X2XXXXXXXXXXXXX2X2222222XXXX2222222222XXXXXXXXXXXXXXXXXXXX2XXXXXXXXXXX2XXXXXXX2XX22XXX2XX222222222X2X222XXXXXXXXXXXXXXXXXXXXXXX222222X2222X222X222222222222XXXXXX22XXXXXXXXXXXXXXXXXXXX2XXXX2X2XX2XXXXX222222222X!2;
key XXXXXXXX3XXXXXXXXXXXXXXXXXXXXXXXX3XXXXXXXXXXXXXXXXXX33XXXXXXXXXXXX333XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX33XXX3XXXXX3XXXX3XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX3XXXX3XXXXXXXX33333XXXXXXXX3XXXXXXXXXXXXXXXXXXXXX3XXXXXXXXX3XXXXXX3XXX3XXXX!3;
key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX44XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX4XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX4XXXXXXXX!4;
key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX55XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX!5;
key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX66XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX!6;

to be continued.
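
Editorial sketch for the statements above: with long key strings it is easy to drop or add a character by hand. This minimal Python sketch checks string lengths against the width of the response block, assuming one key character per response column. That assumption, and the helper names `response_width` and `key_length_problems`, are hypothetical and not taken from the ConQuest documentation. The toy keys are made up, and nothing here asserts anything about the actual keys posted above.

```python
def response_width(first_col, last_col):
    """Number of columns in an inclusive column range."""
    return last_col - first_col + 1

def key_length_problems(keys, expected):
    """Return (position, length) for every key string whose length differs."""
    return [(i, len(k)) for i, k in enumerate(keys, start=1) if len(k) != expected]

# Column range taken from the format statement: responses 11-253.
expected = response_width(11, 253)
print(expected)  # 243

# Toy keys for a three-item test; the second one is one character short.
toy_keys = ["1X1", "2X", "XX3"]
print(key_length_problems(toy_keys, 3))  # [(2, 2)]
```

An empty result means every key string has the expected length; it says nothing about whether the characters themselves are right.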

#### Eveline Gebhardt

Re: A lot of question for the ConQuest program
Reply #12 on: March 12, 2015, 01:07:11 AM
Hi Morten

I have passed on your email to our helpdesk. Please let me know if they do not get in touch with you soon.

In the meantime, I have received the first part of your message (see above), please post the second part if you are able to.

Eveline

#### Morten Puck

Re: A lot of question for the ConQuest program
Reply #13 on: March 12, 2015, 08:00:44 AM
Dear Eveline.

Thank you for passing my message on. I will send you an email if I do not hear from the helpdesk within 10 days. I am in Cracow next week, so I will not be at my desktop computer.

I have just tried to post the next part of the question, but it won't go through. It is a bit frustrating, but I hope there is a solution.

Sincerely
Morten Puck
Research assistant, Aarhus University