Misplaced Pages talk:India Education Program/Analysis


This is an old revision of this page, as edited by TCO (talk | contribs) at 02:03, 3 December 2011 (→High level thoughts: ce). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Outstanding Questions

A big question for me is: what is the best way to measure the impact of this program on the community? We certainly don't want to put more work on the community to estimate this impact, but we also believe measuring the program's impact on ordinary editors is critical to having a full picture of what happened in the Pune pilot. Any ideas of how to overcome this challenge? -- LiAnna Davis (WMF) (talk) 22:14, 1 December 2011 (UTC)

I think the best way to go about this would be to estimate some sort of "crap fraction" reflecting problems that need to be cleaned up in a similar manner to the PPP metric:

\frac{k_{1}(a+b) + k_{2}c + k_{3}d + k_{4}e}{m}

with

a = copyvio edits in any namespace
b = copyvio uploads (local + commons)
c = mainspace edits deleted or reverted for other reasons. Redirecting an article counts as reversion.
d = mainspace edits with orange {{ambox}} problems (e.g. lack of RS, POV). If an edit has multiple problems, add the number of problems instead (e.g. unreferenced POV edit increases d by 2).
e = mainspace edits with yellow {{ambox}} problems (e.g. wikify, copyedit). If an edit has multiple problems, add the number of problems instead.
m = mainspace edits

Edits should be assigned to the category with the highest weight (e.g. G12 deletion => 10 points). The weights are arbitrary but reflect the severity of the problem: k₁ > k₂ > k₃ > k₄. I suggest something like k₁ = 10, k₂ = 4, k₃ = 2, k₄ = 1. MER-C 03:36, 2 December 2011 (UTC)
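As a rough illustration of how the fraction above could be computed, here is a minimal Python sketch. The function name `crap_fraction`, its parameter layout, and the sample counts are hypothetical and not drawn from any IEP data; the default weights are simply the ones suggested above (10, 4, 2, 1).

```python
def crap_fraction(a, b, c, d, e, m, k1=10, k2=4, k3=2, k4=1):
    """Weighted problem score per mainspace edit (illustrative sketch).

    a: copyvio edits in any namespace
    b: copyvio uploads (local + Commons)
    c: mainspace edits deleted or reverted for other reasons
    d: count of orange-ambox problems on mainspace edits
    e: count of yellow-ambox problems on mainspace edits
    m: total mainspace edits
    k1..k4: severity weights, expected to satisfy k1 > k2 > k3 > k4
    """
    if m <= 0:
        raise ValueError("m (mainspace edits) must be positive")
    return (k1 * (a + b) + k2 * c + k3 * d + k4 * e) / m


# Made-up counts purely for illustration (not real IEP figures).
example = crap_fraction(a=5, b=1, c=10, d=8, e=6, m=100)
print(example)
```

Note that with weights larger than 1 the result is a score rather than a true proportion, so it can exceed 1; it is only meaningful when compared across groups scored with the same weights.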

CCI

Attempting to assess the impact on the community of the CCI is virtually impossible at this stage because very little of the work has been done. The investigation is less than a week old and isn't going to be finished for a long time yet (we still have open two-year-old CCIs). I suppose you could investigate what percentage of contributions have been removed as copyright violations, but that's going to produce a severe underestimate because most of the edits haven't been systematically surveyed for copyright problems. Hut 8.5 23:56, 1 December 2011 (UTC)

Would there be data points like the backlog increase or anything like that we could use? We do have a couple of months to do this analysis, so we want to make it as thorough as possible, and if that takes waiting until the CCI process has had a chance to start so there is some time estimate able to be extrapolated, that's fine. -- LiAnna Davis (WMF) (talk) 00:03, 2 December 2011 (UTC)

You could count the number of edits, users or pages which have been reviewed, I suppose, but such a statistic might not be very meaningful because some users are much easier to check than others (some project participants have no edits apart from adding themselves to the project, for instance). It's possible that in a few months the first page of the CCI investigation might be complete, in which case you could get an idea of the number of copyright violations produced by a significant sample of users (196). Hut 8.5 00:17, 2 December 2011 (UTC)

Even that would be an underestimate due to copyright violations that were deleted before the CCI was run. To get a complete picture, you must consider both live and deleted edits. The CCI is also not complete: I noticed one user who posted a copyvio in the user talk namespace. The CCI backlog is so ridiculously large that the proportional impact of the IEP is rather small. The best way to assess impact on the community would be a crap edit (all namespaces+Commons)/potentially useful edit fraction weighted heavily towards copyvios. MER-C 02:13, 2 December 2011 (UTC)

Is analysis needed?

"We want to do a thorough analysis of the Pune pilot program to derive learnings and trends"

Or you could spend a few minutes asking some of the people on the (virtual) ground. This doesn't need statistics - a qualitative analysis (which half-a-dozen names could deliver in less than half-an-hour) would tell WMF more than I still suspect they want to hear. A snazzy Powerpoint with figures on is no excuse for A Clue. Some of the basics of what went wrong are very obvious and they just need fixing. We don't care whether they went a lot wrong or a little, they went too wrong, and they need to be avoided completely in the future. Analysing trends will not explain any of this.

I'd note also that some of the really deep questions rely on the so-far invisible relationship between the WMF and the Indian colleges. Who expected to gain what from this entire process? Without knowing what the intention ever was, it's hard to know just where it went wrong. Andy Dingley (talk) 00:18, 2 December 2011 (UTC)

Hi Andy, Please note that both quantitative and qualitative analyses are happening! As is mentioned, Tory Read (our outside evaluator) is interviewing a few Wikipedians for her report (I see from Hisham's talk page that you were one of the ones asked). We've also been taking note of all the information that's been previously mentioned on the talk pages. If you think we need more editor interviews, suggest that! We want to do such a thorough analysis because everyone we talk to has a different answer of what's wrong with our program, and I think that's because our program design was flawed on multiple points. Fixing one or two of the problems will just result in another frustrating pilot; we want to take the time to plan the next one right, and that involves a very thorough analysis to identify all the reasons why our first pilot failed, and what we can do to fix those problems.

I'm glad you said you think the relationship between WMF and the colleges is invisible, as I wasn't aware that was an issue. The Misplaced Pages Education Program (this is true for our program in India as well as our current programs in the U.S. and Canada) is designed to get more people contributing to Misplaced Pages and to improve the quality of the content of Misplaced Pages (you'll recognize these as goals from the Strategic Plan). Obviously the quality part failed miserably in our Pune pilot, but data from our U.S. pilot showed that article quality improved 64%, which led us to expand the pilot to other countries, including India. The India program is designed to specifically address the Global South parts of the Strategic Plan. You can see the benefits for instructors on the Education Portal's Reasons to Use Misplaced Pages page. If you're looking for more details on the communication between WMF and various other parties in India, I encourage you to check out the Misplaced Pages:India_Education_Program/Documentation page. I'd be happy to clarify any other questions you have about this -- I'm really sorry it wasn't clearer before! -- LiAnna Davis (WMF) (talk) 00:55, 2 December 2011 (UTC)

Extolling the virtues of some relatively minor successes is only to cloud the real issues. The Pune experiment was a major success in that it has clearly demonstrated how not to plan, organise, and execute what are nevertheless extremely important initiatives in the effort to expand the encyclopedias and reach out to other cultures. The endemic gap between the salaried operatives and the volunteers needs to be closed, and there should be less unilateral action on the part of the organisers without listening to the community amongst whom are also some experts (who are not paid). The organisers have been aware for a long time that the relationship between WMF and the colleges is invisible, as well as the issues concerning the general planning and management of the projects from the top down and the absence of communication and cooperation with the online community.--Kudpung กุดผึ้ง (talk) 03:59, 2 December 2011 (UTC)

"important initiatives in the effort to expand the encyclopedias "

That's my main point. Was this an effort to benefit the encyclopedia by adding content, by adding new editors, or to benefit the students/college by giving them an exercise? I don't want to care if this stuff is invisible or not, I don't want to even have to think about it, but when it's time for analysis we have to know what the original hope was, before we can judge whether it met it. All of these goals have some value to them and appear to be a simple goal to achieve. In actuality, none of them would be anything like so easy to meet.

Expanding the encyclopedia is done by having competent people add competent content to it - as a minimum. Usually this arises through some personal passion for a topic and prior knowledge. Students who are arbitrarily assigned a topic are unlikely to have this. It was a huge problem for IEP engineering topics. Students obviously didn't know the topics, didn't stop and learn them before starting to write and had no inclination to even try to - most seemed to think that blindly pasting was enough, without any intervening comprehension. Engineering topics were also poorly worded as titles with no explanation as to what was meant.
We're regularly told that we need new editors (despite the loss of good, undervalued editors being a far more serious problem). Recruiting them from students sounds like a good idea, but getting past WP:BITE needs a fair personal commitment, not just a course assignment. We not only asked students to edit, we asked them to deliver an acceptable article from cold, on their first article. That's hard to achieve on a first edit and it can only be done by people who know something well enough to write it, and who have enough editing skill to produce it. How do we train people to this level? As general quality standards are pushed higher, this is going to become a bigger and bigger problem - how do we get new editors past this skill gap without driving them away first by instant reversions and warnings?
As a benefit to students, it's hard to see how it can possibly work. What's it for? To teach a topic, or to teach the technique of writing for a public audience? The second has some attraction, but where was the teaching? Students were dumped into the end-of-course assessment exercise without being taught anything beforehand. Just how was that expected to work?
Any benefit to tutors is probably a bad idea from the outset. "Here's a chance for an easy self-marking exercise where a pre-existing community can be borrowed as tutors and assessors" is bad teaching, and sheer exploitation of the WP user community. I suspect this was one of the real goals behind this project.

The ones I feel sorry for in the midst of all this were the students. They were given a poorly thought out and broadly impossible task, then criticised from both sides afterwards. Andy Dingley (talk) 11:22, 2 December 2011 (UTC)

Programmatic results

To be certain we are measuring what needs to be measured, I'd like to see more detail regarding the parameters/dimensions of the program. I suggest starting with a list of the program goals and the planned program components and add a list of the unintended/unexpected outcomes. Then quantify those items and analyze the relationships among them. Jojalozzo 00:51, 2 December 2011 (UTC)

Great idea. I'll start working on this list. -- LiAnna Davis (WMF) (talk) 01:11, 2 December 2011 (UTC)

Is this something that will be shared and added to the plan? What is the timeline for conducting the analysis? Jojalozzo 19:29, 2 December 2011 (UTC)

Core issues

The answers do not lie in added bureaucracy: proposed solutions and metrics are all available already in the many and various comments by community members on the increasing maze of talk pages connected with the IEP and USEP projects. The impact on the community is blatantly obvious - why keep creating yet more pages to add to the confusion, and carry out further costly analysis when the answers are already staring us in the face complete with charts and graphs already provided, and have been discussed in depth on the mailing lists and other obscure lines of discussion? The main answer lies in rectifying the continuing lack of communication, transparency, and admission of errors. The main concern raised from the Pune pilot from the community angle is not being addressed, and LDavis and/or AnnieLin have made it clear elsewhere that they do not consider it part of their remit to take the community resources - the very impact that is being mentioned here - into consideration when planning their education projects; ignoring the known problems and trying to find solutions to new ones that apparently still need to be identified is a redundant exercise. Ultimately, this will simply foster more ire and drive yet more volunteers and OAs away from wanting to be helpful, rather than solicit their aid which in any case can only be to repeat what they have already said time and time again. Solutions and suggestions have been tossed around by some extremely competent and knowledgeable members of the volunteer force only to land repeatedly in some kind of no man's land between the WMF and the community.
It is imperative to understand that all education programmes will generate more articles - which is of course the goal of the initiatives, and which is recognised and supported in principle by everyone - that will still need to be policed by experienced regular editors, and that these programmes cannot be implemented before the online volunteer community is forewarned, and forearmed with the required tools and personnel. Perhaps Tory's independent analysis will come up with some answers (and I'm confident it will), and it may be best to wait for her report. Kudpung กุดผึ้ง (talk) 03:33, 2 December 2011 (UTC)

Kudpung: Would saying that for the next India pilot, students are required to expand stub/start class articles rather than creating new articles address your NPP tools concerns? I do hear you that the NPP people need better tools to do your jobs, but I'm afraid WMF isn't going to put all outreach projects that might generate new pages on hold until those tools are fixed. As I said before, I (as a staff person in the Global Development department) have *much* less influence on the tech team's roadmap than you (as an active community member) do. In an ideal world, would WMF work in concert to ensure that all activities across all departments had adequate tech support to assist in all areas of their projects? Absolutely, but I have yet to find a place that works like that, and WMF certainly doesn't. So let's try to figure out a way around that issue for the time being by discussing ideas like requiring students to work on stub/start class articles only. What other ideas could address this problem?

I don't know how much more we can take "community resources...into planning" in this analysis than to ask the question: "What was the impact of this program on the community?" If the analysis says the Pune students caused the NPPs 4X the work that 800 normal newbies would, we obviously need to do something to make sure students aren't creating new pages and thus putting undue work on NPP. So how do we determine that impact? I really value your input, Kudpung, and I would like you to participate in this analysis. I completely agree that we made mistakes in our lack of communication, transparency, and admission of errors. But I feel like that's what this analysis is trying to rectify, and I'm sad to hear you think it's "continuing". I posted talk page messages to talk pages alerting you this page was here. We're being transparent about how we're doing the analysis, and we've posted all the emails that should have been posted earlier in the Misplaced Pages:India_Education_Program/Documentation page. We have admitted making a lot of mistakes. I honestly can't think of any piece of information we haven't released publicly at this point. You know what I know. I talked with Tory as well, and I look forward to her report on January 15, just like you do. I don't know how much more transparent we can be, but please do tell me if I'm missing something. In the meantime, I do hope you'll help us figure out how to measure the impact of these programs on the community so we can brainstorm ideas of how to lessen the load on volunteers like you. -- LiAnna Davis (WMF) (talk) 16:21, 2 December 2011 (UTC)

I don't care about the precise size of the impact. It's clear that it was negative and appreciable, that's as much detail as we need. The useful question is about its cause, and how to avoid that happening again.

As to expansion vs new pages, the choice should be made for its effect (a complex issue, discussed previously) rather than for whether the edits go through NPP analysis or not. Whether NPP has adequate tools or not, or whether NPP applies closer scrutiny, should never be allowed to become an issue here. Would we allow low-quality expansion of articles to go ahead, just because our only quality focus was at NPP? Hopefully not! Andy Dingley (talk) 19:42, 2 December 2011 (UTC)

To be honest, even if the average project participant did cause less damage than the average enthusiastic newbie, that doesn't mean much. For all we know our infrastructure may not be equipped to handle a huge influx of industrious new editors. We should in fact expect the project participants to be significantly better at editing than the average newbie. Otherwise all the support and advice given to project participants would have been less effective than pointing the students to pre-existing help pages and the program would have wasted a lot of effort.

It's pretty obvious from reading the contributions of people involved in the project that the overall standard is very low. Even if you ignore the copyright violations, many or most of them violate core principles or are on subjects which are inappropriate. I don't see that putting an exact figure on how bad they are is going to be much help. Hut 8.5 21:10, 2 December 2011 (UTC)

High level thoughts

1. There seems to be a lack of savvy about how Misplaced Pages itself works (what our content is, how articles get created, what the trials and tribulations are) from the WMF staff. Well, especially in GEP. I advise those of you who have not experienced the factory floor (uploaded an image, been reverted, had an edit war, used the citation template, etc.) to write an article or two. Get a DYK or take some article up to B or so. Get some feel for this place. It will really help you as you do external work if you understand the internal side of things. We are all trying to figure out Wiki as a system and how to make it work and grow and get better. Make it committed work. A little time spent on training yourselves...will pay off in more effective work in the future. The community is seriously torqued and rightfully so.

2. Going after Indian engineering college students was just daffy. They are not good English speakers (I don't care that they try to work in English...the issue is from our perspective they are not good.) Is Pune even a good school? IIT is the one I hear of as tops in India. Let's aim high. Let's go for people who can help Wiki. For the easy fits. Not the hard ones.

3. The programs in the US that seem to have worked better (Sage Ross, Jimmy Butler, JBMurray) did so because of small scale (allows more help) and committed teachers. I'm not saying we can never scale. But the whole approach of going for mass when we don't really understand how to do this stuff well, seems flawed. Cherry picking seems better.

The GEP unit should be judged a bit more on what it learns, not what it delivers for a while, not on these mass experiments. If you do it right and figure out the right model, it will scale then. But I'm not sure it is the current ambassador system (don't know). Perhaps it is writing something up with Jimmy Butler and then getting together with the organization or association of AP Bio/Chem teachers. Perhaps there is some way to spread more like an inkblot, more virally (have successful teachers pitch others). Also, are we monitoring how teachers do in returning to using Wiki? Or is it new prospects every year?

And I do give you credit for trying things, for your initiative. I mean people fly around and set up these programs and make things happen. Wiki community can be very conservative and study too much and not act. So I do give credit for trying things. But...we need to think through this puzzle more, now.

I guess the PPI/USEP also stands out as a success (maybe) and it was "big", but I would like to get a little more info on details of cost and benefit (include the volunteer time, especially if it is people like Christie diverted from content work) and what classes or approaches worked and didn't. Still feels fuzzy to me. I don't understand it the way I understand FA/VA where I cut the onion in several directions. (And you'll never understand anything perfectly or have time to do every analysis imaginable. But...just what I've seen so far on PPI has lacked detail. I just don't feel like my arms are wrapped around understanding it.)

4. I don't understand the participation objective. Frank seems to say it's not an objective--just giving the schools exercises is our mission? Or he thinks this is well spent money for the content received--doubt it. Also it doesn't seem like we are measuring retention of the new editors or had a goal in mind or a hypothesis to disprove (in terms of percent retention).

5. (small insight) One of the cool things about working with Jimmy Butler is that his high school kids have the whole year. There is just more of a feeling of ongoingness than with semester college kids that are really there for 16 weeks or so (less on trimesters). Also, the teachers have more continuity in terms of teaching the same classes year to year (vice uni where the profs vary every year what they teach). Maybe consider attacking the magnet schools? Or Thomas Jefferson and Bronx School of Science? Donno, just some wild ideas. (But TJ or BSS would also be nice "names" for a press release, hint, hint.)

6. Consider trying to go more "high end" instead of mass. Try for better schools. Try for committed teachers.

7. Think about subject a little more. Animal articles are very contained and we have a lot of them to be upgraded. They sort of "just work". The topics may be somewhat obscure, but there are a lot of them and they just can be used as training grounds. The engineering students trying to write articles struggled though. There is not the same diversity of topics. And they were messing up fundamental topics like lever or welding.

A Shakespeare class might have the same problem (unless perhaps a graduate class...) with finding it hard to readily contribute...as there are really just a few plays there and they may have good articles. Donno, scared to check in case it depresses me!

Chemistry is probably another "good topic" where you can probably just find a lot of compounds to write about and they have a reasonably strong structure and can just be used a bit more for training. (But keep them away from elements please...off in random compound land. Elements are too important and vital. Doubt an individual unless Ph.D. grad student would do a good job.) One concern with chem is that it is rather an intense class, more mathematical and exercise-problem solving (and more labs) than descriptive biology. So I don't know if the kids have time to add in a research project. But...it's an idea. Oh...and before you roll it out in mass, prove it with one select class at least.

What are some other topics that are well suited or not well suited? Perhaps geography is well suited because of the diversity of topics and the structured nature of the articles? That is not really taught as a subject much in the United States any more, but perhaps in the United Kingdom. Or perhaps it is close enough to satisfy history teachers (it's sort of similar and the kids will still get the basic learnings of researching information and writing endnote citations).

I suspect GEP has been hawking the Wiki program to profs with a slick presentation and a willingness to take any customers and saying the offering fits all customers. (Guessing here, though.) Perhaps instead if we really think about who are the best prospects (and this post is not enough, it's more sitting back and really wrestling with it) and then target the "good customers", then we will do better.

Oh...and India should be on the deep freeze. Let them earn their way back with some cherrypicked 30 person class at IIT with the one superstar teacher able to make it work.

TCO (talk) 02:02, 3 December 2011 (UTC)