User talk:Yapperbot/Archive 1

From Wikipedia, the free encyclopedia

May 2020

Hello, and welcome to Wikipedia. This is a message letting you know that one or more of your recent edits to Sandbox has been undone by an automated computer program called ClueBot NG.

  • ClueBot NG makes very few mistakes, but it does happen. If you believe the change you made was constructive, please read about it, report it here, remove this message from your talk page, and then make the edit again.
  • For help, take a look at the introduction.
  • The following is the log entry regarding this message: Sandbox was changed by Yapperbot (u) (t) ANN scored at 0.952021 on 2020-05-15T13:44:44+00:00

Thank you. ClueBot NG (talk) 13:44, 15 May 2020 (UTC)

Just a note - this wasn't a malfunctioning bot, this was me making a mistake myself when manually trying something. Sorry, won't happen again! Naypta ☺ | ✉ talk page | 13:46, 15 May 2020 (UTC)

Current discussions

and then to

^please add new comments here.

--David Tornheim (talk) 09:15, 16 June 2020 (UTC) [revised 23:57, 16 June 2020 (UTC)]

  • I was here to notify about some issues too. Already done. Also, did you kidnap legobot? What have you done with legobot?! —usernamekiran (talk) 09:40, 16 June 2020 (UTC)
@Usernamekiran: I would ping the bot coder to get his/her attention to your question. I believe questions/concerns about the bot should be here, not the coder's talk page. --David Tornheim (talk) 11:06, 16 June 2020 (UTC)
  • @Usernamekiran: Hi, thanks for getting in touch - hopefully you've seen the explanations as to what's going on, sorry for any inconvenience! As to Legobot, no kidnapping involved - it's still working just fine on other tasks, it had just stopped doing the FRS for some reason a number of months ago. Attempts to get in touch with Legoktm about it weren't successful, so I rebuilt the functionality into my own bot. The discussion that prompted me to do that is here. Naypta ☺ | ✉ talk page | 11:16, 16 June 2020 (UTC)

Frequency functionality continued

(permalink)

Discussion continued from WP:ANI (and other locations as noted above). Mathglot (talk) 10:22, 16 June 2020 (UTC) [restored with permlink --David Tornheim (talk) 23:39, 16 June 2020 (UTC)]

Copied from Naypta's talk page (permalink) on 00:17, 17 June 2020 (UTC):

Discussion continued from WP:ANI (and other locations as noted at User_talk:Yapperbot#Current_discussions). Mathglot (talk) 10:27, 16 June 2020 (UTC)

Start of continued discussion

@Mathglot: Thanks for opening up this continued discussion.
Can you commit to looking into an adjustment to the code so that a cold start after some time offline won't repeat this? I wrote my answer to whether or not bot should be turned off during an edit-conflict. I'm willing to commit to looking at the code, but I expect it will take a few days before I have any sense of how it works, given my experience with programming/coding does not include wiki-bot coding. I can't promise I will have the time and patience to sufficiently understand it to verify that this wouldn't happen again, but I will give it a shot. I promise that within the week I will at least get started and will put in at least an hour to looking at it and possibly asking the coder or other bot-coders key questions about how it or certain bot-commands work.
If there is no documentation, I might start (or add to) it.
That's it for this subject for tonight for me... --David Tornheim (talk) 10:48, 16 June 2020 (UTC)
@David Tornheim and Mathglot: I won't comment on where this discussion should be - as mentioned previously, I'm more than happy to go wherever it takes me!
This is as far as I can tell the first time that the bot being "turned off" during an edit conflict has been mentioned. What do you envisage that would do? Also, doesn't that have the potential to create quite serious issues with frequently-used talk pages? (It may also not be possible in the current implementation, as Yapperbot is coded deliberately to use MediaWiki's "New section" functionality to avoid ever having edit conflicts.)
The idea of rate limiting is clearly one that's possible, though. In theory, this issue shouldn't ever reoccur anyway, but in the event that it did, it might be good to have rate limits involved. I already have edit limiting code from the bot trial, which is hooked into the FRS bot, so changing that to have a limit on the number of messages sent to a single user per run (the bot currently runs on Toolforge every hour) would definitely be possible if people think that's a good idea. One alternative would be simply to add another parameter to {{Frs user}} that allows users to customise a daily limit - perhaps with a default of 3, then allowing users to set any number there, or 0 for no limit.
Whatever changes are made, I want to make sure that everyone is happy with them - so please let me know your thoughts! Naypta ☺ | ✉ talk page | 11:09, 16 June 2020 (UTC)
On further reflection, I think there are two good options going forward (although there may well be other ones that I've not considered - I welcome additional suggestions!):
  1. Add a per-week limit to {{Frs user}}. I previously said per day limit, but in the vast majority of cases (i.e. pretty much any apart from this edge case of edge cases) that wouldn't be helpful. A week limit would accomplish much the same thing, just with far more utility in normal times, too.
  2. Build the code of the bot to ship multiple notifications to a user in one template. This has advantages and disadvantages: whilst it'd mean less talk page spam in this edge case, it would also mean that the notification might potentially be less clear, as the heading would have to be just "feedback requested" rather than a category (as they might contain multiple categories). It'd also mean the bot would be less easy to debug if there were issues to come up: at the moment, because each message is a product of a single RfC, it's easy to track back issues if they occur and fix them, which would be more problematic without that clear connection.
Let me know your thoughts Naypta ☺ | ✉ talk page | 11:36, 16 June 2020 (UTC)
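The two options above (a per-week limit and bundling several notifications into one message) could be combined along these lines. This is only a minimal sketch in Python, not Yapperbot's Go code: the function name `bundle_notifications`, the data shapes, and the default of 3 carried over to a weekly limit are all assumptions made for illustration.

```python
from collections import defaultdict

DEFAULT_WEEKLY_LIMIT = 3  # hypothetical default; 0 means "no limit"


def bundle_notifications(pending, sent_this_week, limits):
    """Group pending (user, rfc_title) pairs into one message per user.

    pending: list of (user, rfc_title) tuples queued in this run.
    sent_this_week: dict of user -> notifications already sent this week.
    limits: dict of user -> weekly limit (0 disables the limit).
    """
    bundles = defaultdict(list)
    for user, rfc_title in pending:
        limit = limits.get(user, DEFAULT_WEEKLY_LIMIT)
        # Count earlier sends plus what is already queued in this bundle.
        already = sent_this_week.get(user, 0) + len(bundles[user])
        if limit == 0 or already < limit:
            bundles[user].append(rfc_title)
    # Drop users whose weekly allowance was already used up.
    return {user: titles for user, titles in bundles.items() if titles}
```

Each value in the returned dict would become a single talk page section listing every RfC title for that user, which is where the trade-off about a generic "feedback requested" heading comes in.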
Hiya - Allie from IRC here. I would advise putting a hard limit on how many times Yapperbot can write to a specific user's talkpage in a one-hour period, and I would suggest that limit is once - all the FRS notifications for a day should really be delivered in a single edit anyway. I would also suggest implementing a proper rate-limit which takes the per-month limit and uses that to calculate a "cooling-off" period between notifications to a user. For instance, I think I'm set at 30 notifications per month, so a 24 hour cooling-off period would be appropriate, but someone who is set at one notification per month should receive a notification (on average) every 30 days, instead of just on the first of each calendar month. I'm a bit concerned you're referring to this as an 'edge case' - in my opinion, scheduling is core bot functionality. -- a they/them | argue | contribs 12:01, 16 June 2020 (UTC)
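The cooling-off arithmetic described above can be sketched as follows. This is an illustration in Python rather than anything taken from the bot: the names `cooling_off` and `may_notify`, the fixed 30-day month, and treating a non-positive limit as "no limit" are assumptions.

```python
from datetime import datetime, timedelta

DAYS_PER_MONTH = 30  # simplifying assumption: every month has 30 days


def cooling_off(per_month_limit):
    """Return the minimum gap between notifications, or None for no limit."""
    if per_month_limit <= 0:
        return None  # assumption: non-positive limit means unlimited
    return timedelta(days=DAYS_PER_MONTH / per_month_limit)


def may_notify(last_sent, now, per_month_limit):
    """True if enough time has passed since the user's last notification."""
    gap = cooling_off(per_month_limit)
    if gap is None or last_sent is None:
        return True
    return now - last_sent >= gap
```

With a limit of 30 per month the gap works out to 24 hours, and with a limit of 1 per month it is 30 days, matching the two examples in the comment.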
@Alfie: Hi Allie. This is an edge case, because it is by no means normal for there to be this many "new" RfCs to process. If you take a look at the history of the pages Legobot transcludes RfCs onto - for instance, take the Biographies category - you can see that, on a daily basis, there's normally one, maybe two, RfCs per category. Ninety-nine to process at once is, in every sense of the word, a rarity.
That being said, of course, it being a rarity and an edge case does not mean that it's not something that would be useful to address. A cooling-off period, as you refer to it, is of course possible to implement, but I'm not sure it's really all that necessary - if you look at the history of the way that Legobot previously did this, this was never an issue, and I suspect had I just not sent any notifications of the ongoing RfCs and only started sending messages regarding new ones, it wouldn't have ever come up as an issue either. Bundling FRS notifications in a run is definitely possible, although there's nothing to guarantee that a further run that same day wouldn't pick up a new RfC or GA nom, which would then send another message. Once again, the thing to bear in mind here is that the vast majority of the time, each run will consist of one RfC, maybe two at a push - nowhere near the number experienced this morning. Naypta ☺ | ✉ talk page | 12:07, 16 June 2020 (UTC)
PS: As to documentation, the specific bot code doesn't have explicit documentation, because it's not a library, but all the relevant bits of code are commented. Code for ybtools, which is the shared library used across all of the bot's tasks, is commented in standard Godoc style as it is a library, so its full documentation is available here. Naypta ☺ | ✉ talk page | 11:11, 16 June 2020 (UTC)
P.S. shouldn't this discussion be at Wikipedia_talk:Feedback_request_service or User_talk:Yapperbot? --David Tornheim (talk) 10:52, 16 June 2020 (UTC)
  • If you just bundled all the invites into a single section (possibly by detecting whether the last section on a user's talkpage is an existing recent notification, and adding a new notification to it) I think people would be 90% less annoyed. But people are making much too big a deal of this, if indeed it's just a startup phenomenon. EEng 13:51, 16 June 2020 (UTC)
    @EEng: That's one of the options I've mentioned above, yeah. I disagree that people are making too big a deal of it, though; it's important. Being a botop means being in a position of trust, by the very nature of running a bot, and I want people to feel that they can put that trust in me. If people feel I've broken that trust, that's a huge issue, so it is important to have these discussions - at least from my perspective. As I said at the ANI thread, bots are here to serve the community, not the other way around, and I want to make sure that mine works the way it's supposed to. Naypta ☺ | ✉ talk page | 13:56, 16 June 2020 (UTC)
    I can send you a whip with which to flagellate yourself. EEng 13:57, 16 June 2020 (UTC)
    @EEng: No flagellation intended, self or otherwise - it's not about going "oh no, aren't I awful", it's about going "okay, how can we make sure this doesn't happen again?" Naypta ☺ | ✉ talk page | 08:29, 17 June 2020 (UTC)
    Well some people derive, ya know, pleasure from that kind of thing. EEng 12:11, 17 June 2020 (UTC)
    I very much appreciate your saying that in that tone. --David Tornheim (talk) 23:25, 16 June 2020 (UTC)
  • @David Tornheim and Mathglot: Thanks David for moving the discussion over here. To give an update on some steps I've already taken to try and ensure a better distribution even if such a large list were ever to come up again for simultaneous sending:
    • The random number generator that selects which users are going to be invited to give feedback is now re-seeded every selection, rather than on every bot run, which should significantly increase the variety in users selected when a bot run has a lot of messages to process.
    • The bot now waits for five seconds between each invitation, to try and prevent people being spammed and unable to edit their own talk page from edit conflicts.
    • The number of messages being sent for each RfC/GA nom has been lowered - it was previously a random number between 15 and 25, now it's a random number between 5 and 15.
    • Some issues with storing the state of processing GA nominations have been fixed, which had previously caused problems with GA nominations sometimes going out twice.
    Hopefully these will all be helpful to ensure the bot works better! Naypta ☺ | ✉ talk page | 08:35, 17 June 2020 (UTC)
Thanks for the update. I haven't delved into the code, except a brief look at this code you referred to. (What language is it written in?) Responding to some of the points:
"The random number generator is re-seeded every selection, rather than on every bot run."
That seems odd to me that that would make any significant difference. The only possible concern I would have is if the same process started with the exact same seed. (I would only keep the seed the same if I were trying to debug it.) Are you using the clock for the seed? That's how I used to do it, but I was told that even with the clock, patterns can still emerge. Maybe there are new techniques for dealing with seed that were not available many years ago.
"The number of messages being sent for each RfC/GA nom has been lowered - it was previously a random number between 15 and 25, now it's a random number between 5 and 15."
That sounds way too low to me. I've seen RfC's with as many as 200 responses. Do you mean 15-25/day has been reduced to 5-15/day?
Did you get those numbers from LegoBot?
Have you looked at the LegoBot code to see how it works? Do you know if it is available for public review too?
"...[problems with] GA nominations have been fixed...sometimes going out twice."
Glad to hear it!
Is there a location where other users reviewed your code before it was released? If I find it, I'll delete this, or tell others where that is.
--David Tornheim (talk) 10:32, 17 June 2020 (UTC)
@David Tornheim: I'm using the clock for the seed, yes - the reason I changed it to every selection is to improve randomisation on runs that have huge numbers of messages, like the one we had yesterday morning. As the RNG was only seeded once, at the start of the process, with the timestamp at the start, the random permutation algorithm was returning the same permutations throughout - because it had the same seed - meaning that a small number of people received a very large number of messages, because their usernames were sorted to the front of the queue by the permutation generator. Now that it's reseeding the generator every selection, it will perform a new random selection every time.
The Legobot code is available for public review over here, but it's not exactly well-documented, to say the least - I'll freely admit that my eyes gloss over a bit when I see things like $temp01, $temp02 and foreach ($temp02 as $temp2) {. I had difficulty gleaning much at all from it, so no, I didn't get the numbers from Legobot - but I can adjust them as necessary, if the feeling is that they ought to be different. I might end up putting them on-wiki, so it doesn't take a code change to adjust them as needed.
As to code review, the code was available for review during the BRFA, but I don't know if it was actually reviewed by anyone - I don't imagine that it was, as it would make BAG members' lives even more difficult having to review entire codebases in languages they might not be familiar with to approve bots. The trial run over there went smoothly, because it was dealing with a smaller volume of messages to send, so there wasn't really an opportunity for this sort of an issue to arise. Naypta ☺ | ✉ talk page | 10:36, 17 June 2020 (UTC)
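To illustrate the seeding point, here is a small Python sketch of one way the reported symptom could arise: if every selection builds its generator from the same stored seed, every selection returns the same users. How Yapperbot's Go code actually used its generator is not shown in this discussion, so the function names and the mechanism below are assumptions, not a description of the bot.

```python
import random


def pick_shared_seed(users, count, seed, selections):
    """Build a fresh generator from the SAME seed for every selection."""
    return [random.Random(seed).sample(users, count) for _ in range(selections)]


def pick_fresh_seed(users, count, seeds):
    """Build a generator from a DIFFERENT seed for every selection."""
    return [random.Random(s).sample(users, count) for s in seeds]


users = [f"User{i}" for i in range(20)]

# Same seed every time: all ten selections are identical, so the same
# five users would receive every message in the run.
shared = pick_shared_seed(users, 5, 1592300000, 10)

# A new seed per selection: the selections vary across the run.
fresh = pick_fresh_seed(users, 5, range(50))
```

Note that a single generator seeded once and then reused across calls would also produce varying selections; the problem only appears when the same seed is applied again before each selection.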
That's really too bad if no one reviewed it. It's troubling to me the possibility that code of this importance was not reviewed by at least one long-term experienced bot coder--ideally someone who had been here close to the start of the project. If truly no one reviewed it, I applaud your bravery at putting yourself out there like this, and give you way more slack than I might have from the beginning. No individual should have that level of responsibility.
I had assumed that all bot code had to be reviewed by a number of coders--maybe that was the case in the past.
How much time did you spend analyzing this kind of data:
1. the average number of new RfC's launched per month
2. " " per category
3. the typical number of responses to an RfC?
Do you have some data that you compiled and analyzed?
It sounds like the tests you did were insufficient.
Maybe we can work on some better tests and data analysis to be sure it is doing what editors expect and the input/output ratio makes sense. If we can figure out what LegoBot did, that might be the best, since the challenges we are seeing now may have been worked out already over the years the bot was working.
I'm going to try to do some more research on how Legobot handled the RfCs. Did anyone assist you at all in working on the code?
I see the programming language is Golang.
I'm a little unclear on exactly where Yapperbot is located. I think you said: ybtools
If that's the case, where is the entry point(s)? I hope you can be patient with me. This may be obvious to others... --David Tornheim (talk) 11:10, 17 June 2020 (UTC)
P.S. After writing the above, I see that three editors commented at Wikipedia:Bots/Requests for approval/Yapperbot. I'm surprised they are not here commenting and making suggestions on how to move forward. I think we should invite them at some point, but hopefully they will find their way here on their own. I think those who reviewed and approved this may share some responsibility for the problems and glitches that could have been caught. --David Tornheim (talk) 11:10, 17 June 2020 (UTC)
@David Tornheim: At the end of the day, this is a volunteer project, and whilst MediaWiki has paid developers, neither I nor the BAG, nor any other user, has any financial incentive for building bots - we do it to make the encyclopedia better. Imposing a requirement for code review makes total sense when committing stuff to the core repositories for MediaWiki, and indeed is used for MediaWiki extensions, but for a bot - something that runs off a single user account, can be easily blocked if need be, and is still subject to many (albeit not all) of the same limits that other users would have - it's just not a requirement. To the best of my knowledge, it's never been one.
I don't have a formal analysis of the number of RfC responses, no; I of course have anecdotal experience, but I don't have a specific report to show you. If this was a new bot task, I might have considered doing such a report, but as it's just taking over from what a bot was doing previously, replicating the functionality without adding anything new, my feeling (and evidently the feeling of the BAG) was that such a requirement would have been unnecessary. This is, again, the difference between an enwiki bot and commercial code: this is a passion project, not something where there's a list of deliverables, a timeframe and some set objectives.
I'm the sole author of all of the code - there have been no other contributors, pull requests or specific code suggestions made, with exception to some help I had with some of the regexes from some very helpful people on #regex on Freenode. ybtools is the library that powers all of Yapperbot's tasks, including the FRS one; the specific FRS code is here. The entry point in that code is main() in main.go, as is standard for Golang code. Naypta ☺ | ✉ talk page | 11:19, 17 June 2020 (UTC)
I wrote the below response while you were posting the above. I haven't had a chance to read it yet. Need to call it a night... --David Tornheim (talk) 11:55, 17 June 2020 (UTC)
@Naypta: I'm back. Thanks for the two responses of 11:19 [above] and 12:05, 17 June 2020 [below]. I will start responding to those shortly. Are there any updates other than #Moved_from_kill_switch below? After I reply to you, I have some ideas of work I can do that hopefully we all can find helpful in making sure this bot's handling of RfC's is the best it can be in the short term and long term.
Having not been involved in the code, I think I can help others (especially those who don't have strong programming skills) understand how it works, how it might be improved, and which parameters adjust its function, both the internal limits and the parameters that can be input by users (or admins). I propose tests that can be performed to assure desirable function. I do intend to help collect data on RfC's. I think that is important to seeing if the parameters you established are reasonable/desirable. --David Tornheim (talk) 01:54, 18 June 2020 (UTC)

Importance of Bot

@Naypta: In response to this comment immediately above about how this is a volunteer project and we are not writing code for money here: yes, obviously true. You seem to imply that because of this, we should lower our standards and shouldn't expect vigorous code review and/or testing. On that point, I disagree.

Despite being a volunteer project, this is one of the most important places people go for their information. The primary reason I edit is that I want to contribute and do my part so that the information we deliver is what is reflected in reliable sources and does meet our rigorous standards like WP:NPOV.

As for the importance of accuracy in bot coding/functionality/testing, especially RfC notifications, I think it is very important: RfC's are a good way to invite non-involved users to help address difficult content disputes and to help with WP:NPOV and WP:OWNERSHIP issues. If it wasn't so important, I wouldn't be devoting so much time to helping with getting this bot working satisfactorily. I have been on this project for 12 years, and have not taken any serious interest in the functionality of these bots and how they are coded, until I eventually discovered that the reason I stopped getting WP:FRS notices is because the bot has been broken. That is certainly a problem for the project and I am very pleased you are working on it. Again thank you! --David Tornheim (talk) 05:11, 18 June 2020 (UTC)

@David Tornheim: I don't wish to suggest that the quality of our code should be lower as a consequence of it being a volunteer project; I merely mean to say that people have less time to commit to it, and requiring a formal process of code review would make innovation incredibly difficult, as a result of nobody having the time to deal with it. I'm writing code in Golang here, I've used a library for the core MediaWiki communication, but for the majority of the other parts of the bot, the code is all mine in raw Golang. It would be unreasonable to expect BAG members to learn Golang specifically so they could critique my bot, and it would also be unreasonable in my view to say "if you want to develop a bot, you have to do it in PHP". Code review on MediaWiki itself works because MediaWiki is (close enough to) a single codebase, and projects in Gerrit are generally those that have official Foundation support or have significant enough perpetual community support to be able to go through those code review processes easily. This model doesn't work for bots - not least because, as we've talked about below, there is no requirement for code to be open. Naypta ☺ | ✉ talk page | 08:19, 18 June 2020 (UTC)

Legobot

Legobot's code

@Naypta: I just looked at the LegoBot code you referred me to: Legobot code. It actually looks pretty straightforward to me--far easier to read than I expected. I see it does a per day calculation:

70: $time = 30/$frsl_limit;

That's how it avoided the problem you ran into. I don't see any limits to how many notices go out per RfC. I do see that it uses an SQL database to make it efficient. It looks like it cycles through every single RfC and every single user who wants notification and there is nothing random about it. I believe it focuses on how much time is left before user can be notified again. I believe it is something like this: (for the "all" RfC case)

initialize MinTime = maximum time to wait before bot runs again (e.g. 1 day)
For each user ($u) {
    For each rfc ($r) { /* starting with oldest first */
        if ($u has not been notified of $r -and- $u has not reached notification limit yet) {
            notify $u of $r
            tell database that $u has been notified of $r
        }
        Calculate delta time $t required before $u can receive next notice.
        If $t < MinTime then MinTime = $t
    }
}

The bot would wait the larger of MinTime -and- some min. increment (e.g. 10 seconds) before bot cycles through these again.

That was for the general case of getting all RfCs. Another third loop for each category $c would handle that issue.

To be efficient it could keep track of how much time is left before each $u can receive another notification in each category $c.

Whichever user can next receive a notification is when bot would do the query again.
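The loop described above can be written out as a runnable Python sketch. It follows the reading of Legobot given in this comment, not Legobot's actual PHP or SQL, so every name here (`run_cycle`, `notified`, `last_sent`, `min_increment`) is hypothetical, and the per-user gap of 30 days divided by the limit is taken from the `30/$frsl_limit` line quoted earlier.

```python
from datetime import datetime, timedelta


def run_cycle(users, rfcs, notified, last_sent, now,
              max_wait=timedelta(days=1),
              min_increment=timedelta(seconds=10)):
    """One pass over all users and RfCs; returns (sent, wait before next pass).

    users: dict of user -> per-month limit (assumed > 0).
    rfcs: list of RfC ids, oldest first.
    notified: set of (user, rfc) pairs already sent; updated in place.
    last_sent: dict of user -> datetime of last notice; updated in place.
    """
    sent = []
    min_wait = max_wait
    for user, limit in users.items():
        gap = timedelta(days=30 / limit)
        for rfc in rfcs:
            last = last_sent.get(user)
            ready = last is None or now - last >= gap
            if ready and (user, rfc) not in notified:
                notified.add((user, rfc))
                last_sent[user] = now
                sent.append((user, rfc))
        # Time remaining before this user may be notified again.
        if user in last_sent:
            remaining = max(last_sent[user] + gap - now, timedelta(0))
        else:
            remaining = timedelta(0)
        min_wait = min(min_wait, remaining)
    # Never loop faster than the minimum increment.
    return sent, max(min_wait, min_increment)
```

In this sketch each user receives at most one notice per pass, because sending a notice resets that user's gap; a database-backed version would replace the two nested loops with a single query, as suggested below.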

Are you familiar with SQL? I recently took a Database course, so I know it.

Because it is database, a properly designed query should take care of both loops, returning all relevant records.

Anyway, I believe I can figure out how it works and document it. I will be very curious to see how things are similar or different than your code. --David Tornheim (talk) 11:55, 17 June 2020 (UTC)

@David Tornheim: Yes, I'm familiar with SQL. It was a deliberate design decision not to use a database for any part of Yapperbot. I didn't want to create a scenario similar to that we have with Legobot, where I might be unavailable to fix it, and things start going wrong, with nobody else able to sort it out, or understand why it's wrong. To that end, all of the data is stored on-wiki, with sole exception to a single file that just stores the timestamp and ID of the last GA nomination (this file is automatically created, so you could replicate it on your own PC anyway).
I came to the same conclusion when evaluating Legobot's code that you did - that there was nothing at all random to do with it - but that completely contrasts with not only the statements made on-wiki about what Legobot did, but also with the code itself - you see references to random user selection in the variable names. I can only suppose that the code that is available on GitHub is not the actual code that was running on Legobot, which is part of the reason I didn't waste too much time looking over the code and documenting every part of it. The other part of that decision was that I wanted to ensure that the design of Yapperbot was independent of that of Legobot, as there had previously been many issues with Legobot even when it was operational, with some even calling for it to be blocked for the way that it handled the FRS. (Incidentally, I'm currently writing code to prevent this exact issue - allowing individual Yapperbot tasks to be blocked individually by administrators.)
Keeping a particular timeframe between informing users is one way of dealing with the issue, I can appreciate that - but on the other hand, that does potentially negatively impact on users who are more active and genuinely want to receive more notifications. It would also result in a much larger JSON file stored on-wiki, so it's something to bear in mind. Naypta ☺ | ✉ talk page | 12:03, 17 June 2020 (UTC)
I'm back [1]. Will respond to these soon. --David Tornheim (talk) 01:58, 18 June 2020 (UTC)

Is Legobot code private?

@Naypta: Above you had said: I can only suppose that the code that is available on GitHub is not the actual code that was running on Legobot, which is part of the reason I didn't waste too much time looking over the code and documenting every part of it.

This statement is deeply concerning to me. From what you are saying, it sounds to me that you are saying we don't have access to the actual code that is running Legobot and therefore we are not able to change it, and that only the author of the code can. If true, that to me is a big problem.

I don't think bots that can have huge impacts on the project and could make 1000s of edits in seconds should be private.

Also, what if the coder suddenly disappears? What if the code has biases built in that we are not aware of because we can't see and analyze the code?

I didn't realize that bots are basically like regular accounts that can be blocked, but are controlled by software.

The potential that the code or data the bot uses might be private and could be changed without our knowledge is very concerning. It reminds me of all the problems with Black box voting and proprietary software. Richard_Stallman gives his two cents on that.

If the code for some bots really is proprietary I believe that would need to be discussed elsewhere, and probably already has. If anyone wants to point me to those discussion(s), I'm all ears.

But I do want to confirm: Is it true that we really don't know for sure what code Legobot is running on?

--David Tornheim (talk) 05:20, 18 June 2020 (UTC) Wow.

Authors of bot processes are encouraged, but not required, to publish the source code of their bot. WP:BOTCONFIG Wow. --David Tornheim (talk) 07:53, 18 June 2020 (UTC)

@David Tornheim: Well, technically speaking, nobody but myself and the admins of Wikimedia Toolforge knows for sure what code I'm running Yapperbot on. I can tell you, and tell you truthfully, that the exact code you find on GitHub is what is running on Toolforge, but there's no independent guarantee of that of course, apart from going and asking a Toolforge admin. I've chosen to host Yapperbot on Toolforge, not only because it provides a free environment for hosting tools beneficial to the Movement, but also specifically because it means that if I'm not around for a long time and things break, someone else can go and request maintainer status on the tool to take over from me. Using Toolforge for a bot, however, is very much not required.
Bots on Wikipedia aren't running elections - they're sending messages, reverting vandalism and handling minor cleanup tasks. I like free software as much as the next person, and I strongly believe that bot operators should make their bot code public, but I don't think it should be that they must do so. Naypta ☺ | ✉ talk page | 08:15, 18 June 2020 (UTC)
LOL. Well I appreciate your sentiment of thinking this should be public. In the distant past I worked at big and also small corporations that maintained code for hardware and software. It would be absolutely unacceptable at those businesses for a regular employee to provide an executable that the company heavily relied on for which no source code was available. If they gave two week notice, the first thing they would need to do is document all their software.
Bringing the company's valuable software home with you--even if you created it--is technically stealing company work-product and could land one in court or even jail. (I'm assuming you know that.) I remember cases against former Intel & IBM employees, and FBI raids. I was a first hand witness to a section head who was fired by a superior who had previously worked for IBM. The superior said he used IBM's method to deal with the firing: The superior first crashed all the computers, then invited the employee who was to be fired to his office to tell him bye-bye, and then had the locksmith change the lock on the employee's door during the meeting!
This was all to keep the employee from either taking, deleting, or changing any software. It was something else to experience. I will never forget it.
I shared this story so you might understand just how shocked I am that wikipedia makes no serious effort to assure bot software it relies on doesn't disappear with a volunteer...as happened with this fairly important bot...
That you have had to recreate the bot from scratch, because you don't have access to the code, is just as unbelievable. I do believe you. --David Tornheim (talk) 09:05, 18 June 2020 (UTC)
@David Tornheim: I understand what you're saying, for sure - but this isn't IBM, it's Wikipedia. We're very proud of being volunteers, of not being a corporation - we're here because we want to be. If some bot the community uses breaks, someone else can pick it up and either fix it, if its code is available, or build a new one if it's not. Again, I fully support bots being FOSS - hence all parts of Yapperbot are open source and licensed under GPLv3 - but I'm not sure it's reasonable to demand that volunteers give up their IP in that way.
As to recreating the bot from scratch, it wasn't actually lack of access to the code that made me take that decision primarily; I could have taken the exact code on GitHub, tried running it, and just gone from whatever happened when I did that. There were two primary motivating factors for rebuilding it completely: removing the reliance on an external database, to make sure that the bot was as open and transparent as possible, and changing from PHP to... almost anything else. Naypta ☺ | ✉ talk page | 09:12, 18 June 2020 (UTC)
Well, Wikipedia probably is more important and has greater impact and visibility than those private entities, so the stakes are higher.  ;) I'll take the comments about intellectual property under advisement--I think that is better discussed at the policy page. I believe I'm done discussing this for tonight. I hope you don't mind that I sort of volunteered you to take over Legobot. I won't feel at all hurt if you don't want to, and I am happy to revise that suggestion if no one has responded... --David Tornheim (talk) 09:34, 18 June 2020 (UTC)
(edit conflict) I followed this up by posting here: Wikipedia_talk:Bot_policy#Bot_code_--_can_remain_proprietary??? --David Tornheim (talk) 08:15, 18 June 2020 (UTC)

Legobot Functionality

Investigation by DT

@Naypta:, @Mathglot: I have begun looking to see how Legobot handled notifications while it was operating based on its edits (function) rather than its code.

Method: I choose a random date [and possibly time] during the period Legobot did these notifications: 9/16/2013 through 12/3/2019. I investigate a single RfC from start to finish to see what Legobot did with it. I am also looking to see what percentage of the editors who were notified chose to comment at the RfC.

First Review: The first random date was 12/17/2016. I decided to start at 00:00, rather than the randomly selected time of 20:18. The first RfC that is addressed is opened at 12/16/2016 15:22 and closed 1/21/2017 18:06.

Observations: We had been under the impression that Legobot notified everyone who is eligible to receive a notification of that RfC. However, that does not appear to be how it works. Instead, it appears to notify one new editor (or no editor) every 24 hours for about one month before the RfC expires: 12/17/2016 04:23, 12/18/2016 04:23, 12/19/2016 04:26, 12/20/2016 04:23--skipped, 12/21/2016 04:30, ..., 1/10/2017 04:23, 1/11/2017--skipped, 1/12/2017--skipped, 1/13/2017--skipped, 1/14/2017--skipped, 1/15/2017 04:23--skipped, expired 1/15/2017 16:01.

Legobot also updates a file called "Archive Index" for the relevant talk page showing approximately how many people comment in each section of the talk page, even after the RfC is expired: 1/14/2017 83 -> 84, ..., 1/20/2017 85->87, 1/21/2017 87->90 [I haven't had time to proofread this carefully, but I think most of it is correct. Will likely fix any typos and do some copyediting.] --David Tornheim (talk) 16:03, 9 July 2020 (UTC)

Functionality

Wow, a lot has happened. I don't even know the right place to respond, so adding this artificial break. Feel free to refactor, retitle this from "break" to something better, or to move it to a more logical place. (I'm waiving TPO, so go for it.)

Your point #2 of shipping multiple Rfcs in one post would be acceptable, because it eliminates locking me out of my page, but it's not optimal, because I'm simply not going to respond to 30 Rfcs; it would be better to space them out, meaning in practice, I won't get all 30 of them, and most will be reassigned to something else, which is exactly what ought to happen, if there's any hope at all that someone will address them.

We shouldn't lose focus of what this is all about: this is not about coming up with a schedule for a bot dropping cookies or kittens or brownies on user talk pages, and then it's done, and how many kittens are too many. There's a point to the messages that the bot is writing: namely, to encourage a user to go to the linked page, and spend the amount of time required to respond to an Rfc that needs eyeballs. That is the goal here. Working out whether the "bot is doing what you told it to" on the signup page, is a cop-out. We all know why people are signing up, and we're volunteers here, so don't insult our intelligence. (This is not directed at you, but at a snarky comment on the original page, that elicited only my disdain for software engineers that have spent too much time in the office, and are disconnected from people, and reality. I'm an ex-sw engineer, and I recognize the disease. Let's chalk it up to covid-cabin-psychosis, and give them a pass this time.)

I'm not going to respond to all the stuff about different seed and other individual tweaks, because I have no way of knowing how that will alter things next time around. The goal I'm looking for, functionally, is that if I've signed up for 60 notices a month, that works out to one every 12 hours; therefore, after Yapperbot sends the first one to me, it must stay off my page for 12 hours. I don't know the internals of either the tables or the bot (I know the code is open, but I'm trying to stay away) but I know if I were designing it, I'd do a first pass of the signup page to build an in-memory hash with {$user -> [$average_delay, $last_notif_timestamp] } (where $average_delay holds "12 hours" and ts=0 after startup pre-scan in my case, in whatever units are convenient), and each time through the main loop, you check if $time_now > $last + $delay, and if it isn't, you skip to the next user, and if it is, you do your thing and update $last. If you crash or cold start, then you'd send me one notification right away, and no other for around 12 hours, right? (This won't fix the case where your program is crash-restarting every few minutes, but I'm assuming that you don't auto-restart after a crash? Extra credit: read my TP after a crash, find your last notif, set $last from that.) If that doesn't work, then every time you write me an Rfc, you first read my Talk page, to make sure at write time that you don't wipe out someone else via an edit conflict, right? So, use your Rfc notif on my page as a cookie: read back the timestamp of the last Notif you wrote, and calculate the delay that way. Mathglot (talk) 03:25, 18 June 2020 (UTC)
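The per-user throttle described in the comment above can be sketched roughly as follows. This is a minimal illustration in Python, not Yapperbot's code (which is written in Go); the names `UserState` and `due_users` are hypothetical, and the table is assumed to be rebuilt from the signup page on every start, so a cold start leaves every timestamp at 0.

```python
from dataclasses import dataclass

@dataclass
class UserState:
    average_delay: float      # seconds that must pass between two notices
    last_notif: float = 0.0   # time of the last notice; 0 after a cold start

def due_users(states, time_now):
    """Return the users who may be notified at time_now, updating their timestamps."""
    notified = []
    for user, state in states.items():
        # Mirrors the check "$time_now > $last + $delay"; otherwise skip the user.
        if time_now > state.last_notif + state.average_delay:
            state.last_notif = time_now
            notified.append(user)
    return notified

HOUR = 3600
states = {"Mathglot": UserState(average_delay=12 * HOUR)}
start = 1_592_450_000  # arbitrary epoch timestamp used only for this example

first = due_users(states, start)               # cold start: goes out right away
second = due_users(states, start + 1 * HOUR)   # one hour later: skipped
third = due_users(states, start + 13 * HOUR)   # more than 12 hours later: sent
```

The "extra credit" variant would replace the initial `last_notif = 0.0` with a timestamp read back from the user's talk page; that step is left out here because it depends on how the notices are formatted.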

Oh yeah: and a shout-out to "not an edge case, but core functionality". +10. Mathglot (talk) 03:33, 18 June 2020 (UTC)

@Mathglot: I basically agree with everything you wrote. Do you agree with me that there should be no limit as to the number of people who receive a notice for a particular RfC? I believe Legobot had no limit. --David Tornheim (talk) 04:16, 18 June 2020 (UTC)
A big question for me is how it is supposed to work when there are more RfC's per month than the user requests (that also applies equally to # of RfC/month for a particular category). Which notification would they not get, especially after cold start? For the case where the expected # of RfCs per month is less than the user limit, then one would start with the oldest first until it is caught up. But the reverse might be true for someone who only wanted 1 RfC per month. Any thoughts on that?
I haven't looked closely enough at the new code and data stored to see how we know if the user had or had not been informed of a particular RfC. There are other RfC attributes of importance for helping to answer and address the questions I raised in the above paragraph:
  1. Is it open?
  2. When was it opened?
  3. How long has it been open? [easily calculated from 2.]
  4. How many editors have been notified?
  5. How many editors have responded?
Items 1, 2, and 3 seem essential to bot functionality. Calculating 4 and 5 might be more difficult. Are these being calculated and used? --David Tornheim (talk) 04:18, 18 June 2020 (UTC)
@David Tornheim and Mathglot: To try and give a brief answer to all of the technical questions that have been raised here:
1: The bot detects whether an RfC is open or not based on the presence or absence of the {{Rfc}} tag, which is removed from RfCs that are no longer open.
2 and 3: When the RfC was opened absolutely is not taken into account, only relatively. The bot scans for RfCs, detects those that have been opened since it was last run (i.e. those that haven't yet had a message sent), and sends the messages for them.
4: This is easily technically possible, but isn't relevant in the current implementation.
5: This is more difficult in the current implementation, and nor is it relevant to it.
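As a rough illustration of answer 1, the presence or absence of the tag can be checked with a regular expression. The sketch below is in Python and is not the bot's actual pattern; it assumes the tag looks like `{{rfc|hist|pol|rfcid=ABC1234}}`, and the name `open_rfc_categories` is made up for the example.

```python
import re

# Assumed tag shape: {{rfc|cat1|cat2|rfcid=ABC1234}}. Illustrative pattern only.
RFC_TAG = re.compile(r"\{\{\s*rfc\s*((?:\|[^{}|]*)*)\}\}", re.IGNORECASE)

def open_rfc_categories(wikitext):
    """Return one list of category codes per {{rfc}} tag found in the wikitext."""
    results = []
    for match in RFC_TAG.finditer(wikitext):
        params = [p.strip() for p in match.group(1).split("|") if p.strip()]
        # Positional parameters are treated as categories; name=value pairs are not.
        results.append([p for p in params if "=" not in p])
    return results

open_page = "== Proposal ==\n{{rfc|hist|pol|rfcid=ABC1234}}\nShould we ...?"
closed_page = "== Proposal ==\nShould we ...?"

open_result = open_rfc_categories(open_page)      # one tag with two categories
closed_result = open_rfc_categories(closed_page)  # tag removed: nothing found
```

An empty result is what "no longer open" would look like under this assumption, since the tag is removed when the RfC closes.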
In the current implementation, the list of applicable users for an RfC is randomly sorted, then users are selected from the start of the list in order - which is a way of performing random selection of the users. If a user has exceeded their limit, the next user in the (random) list is chosen. If the entire list is exhausted, it just sends to the number of valid contacts it's reached at that point.
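The shuffle-then-walk selection described in the paragraph above might look something like this. It is a Python sketch under stated assumptions: `select_recipients`, `sent_count`, `limits` and `wanted` are hypothetical names, and the real bot keeps its counts in on-wiki JSON rather than in a plain dictionary.

```python
import random

def select_recipients(subscribers, sent_count, limits, wanted, rng=random):
    """Shuffle the applicable users, then take them in order until `wanted` are chosen."""
    shuffled = list(subscribers)
    rng.shuffle(shuffled)          # random order stands in for random selection
    chosen = []
    for user in shuffled:
        if len(chosen) == wanted:
            break
        if sent_count.get(user, 0) >= limits[user]:
            continue               # over their limit: try the next user in the list
        chosen.append(user)
        sent_count[user] = sent_count.get(user, 0) + 1
    # If the list runs out first, fewer than `wanted` users are returned.
    return chosen

limits = {"A": 1, "B": 1, "C": 1}
sent_count = {"A": 1}              # A has already reached their limit
chosen = select_recipients(["A", "B", "C"], sent_count, limits, wanted=5)
```

Here only B and C can be chosen, so the call returns two users even though five were wanted, which matches the "sends to the number of valid contacts it's reached" behaviour described above.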
To reply to some of Mathglot's concerns in particular - because of golang's defer functionality, a standard panic should cause cleanup calls to be sent, meaning that the relevant JSON updates would be stored for everything the bot has done anyway; a full on segfault might not do that, but it wouldn't auto-restart then until the next time it was scheduled. Logs are stored on Toolforge for it.
I'm not sure it's a good idea for the three of us to set the specifics of what is, at the end of the day, a decision as to what the FRS is rather than how it is implemented - that is, vis-à-vis the number of messages sent per day, and whether users are selected at random or simply all are sent. However, I recognise that two people expressing concern is, in and of itself, significant. For that reason, I intend on creating a survey of FRS subscribers, to see what people specifically want. I will write this up in neutral terms, and once it has been written, I will advertise it both through a mass message to FRS subscribers and at the relevant Village Pump. This will address several of the issues here, along with some other potential changes that have been suggested. I hope that's something we can all get behind.
Final thing for this message (and sorry that it's become a slight sermon!) - I want to address the "edge case" thing. My name is double-barrelled. I'm a living edge case! It's not a way of saying "oh, this doesn't matter because it rarely happens", it's just saying "this isn't the normal operational condition". I find it frustrating when organisations can't work with my name because of the hyphen, just like if you're caught up in this edge case it's frustrating - and I don't want to minimise that, so apologies if that's the way it came across. I'm talking specifically from a technical perspective when I use the term "edge case". Naypta ☺ | ✉ talk page | 08:10, 18 June 2020 (UTC)
Just one point, which David raised, and Naypta kind of answered the way I would wish, namely:
Q: Do you agree with me that there should be no limit as to the number of people who receive a notice for a particular RfC?
Without being coy, I don't want to respond to that here, because it seems to me out of scope, and because it could tend to lead to a more open-ended discussion about all the functionality of the bot, including, polling people what they want the FRS to be or to do. That is far beyond the scope of how I view my involvement, which is basically as a bug-report on (what I see as) the existing functionality, not developing a new conception of what the bot should do across the board, or how it should be designed. From my viewpoint, the current (implied) specification is fine: the bot delivers Rfc notices periodically to my Talk page, obeying the rules implied in the sign-up page. To me, "implied in the sign-up page" (since it's not a formal spec) includes not delivering more than two a day in my case; anything else is a bug; to others, "it's what you signed up for".
Naypta, it sounds like you are agreeing with my basic conception, and I wouldn't poll anybody about what I view as a bug, but if you do go poll the group about anything, I would only beg that you keep it to the most focused and targeted question possible, to avoid opening up a free-ranging discussion that could open up a can of worms. (Nothing wrong with that for later; I just consider that a different scope, deserving of a different discussion.) If you do decide to, the wording of such a question could be a little tricky, but imho it should be restricted to what I believe we are all talking about here: "what is the right frequency (or maximum frequency) of comments that should be posted to your Talk page, if you have signed up for N Rfcs per month?" You can add an example, if that makes it any clearer: "For example, if all your signups could theoretically produce 30 Rfc invites on your Talk page over the course of a month, when the bot starts up fresh from scratch after a long outage, how many should it post the first day?" (My answer: "1".) It's a little tricky how best to phrase this, so it really covers the information you want, without becoming too opaque to the user. Maybe you can come up with better phrasing. Or, you could just treat it as a "bug fix", put in the frequency throttle yourself and don't survey anybody, and see who complains. I fully expect nobody would complain, and practically speaking, that is probably the least time-consuming approach. It's late and I'm a little fuzzy, but I hope this helps. Mathglot (talk) 08:38, 18 June 2020 (UTC)
P.S. re "edge case": my spouse has a hyphen, I don't, but I do have weird consonant bigrams that flummox most everyone. I'm not so much an edge case, as a vertex, or a point at infinity or something... Mathglot (talk) 08:40, 18 June 2020 (UTC)
@Mathglot: Heh, I like the description of someone as a "point at infinity" - I think we'll get along nicely with those sorts of descriptions. As to the actual substantive issue, I've been thinking about this for a little while. I'm not opposed to limiting the number sent to one a day, in any way - so I did think about just implementing it straight up. The problem with that, I realised - and the reason I'm going to add this to a survey - is that it's not clear what happens when everyone on the list has been asked, then. At the moment, there are some categories - especially for GA nominations - that are very small on the main FRS page, very few editors subscribe to them. If a GA nom comes in which fits into one of those categories, say with three users in: let's say User 1 has already had a message today, User 2 has already had a message today, and User 3 has reached their total limit. What happens? Do we delay sending the message until tomorrow (no function to do that in the current implementation, but could be written in)? Do we ignore that GA nom completely? Do we do something totally different? This then becomes a more "what does the FRS look like" question than a "technical implementation" question, which is why I think it needs to be put out for broader consensus.
As to specificity, I agree - what I'm going to do is create a page under WP:FRS with some (probably around six to ten) specific questions, each with a survey subsection and a discussion subsection. Hopefully people will discuss those questions sensibly! Naypta ☺ | ✉ talk page | 08:52, 18 June 2020 (UTC)
Please understand that in saying "one a day" I am not suggesting or implying some sort of artificial threshold or rule of thumb that applies to everybody. Rather, I was saying it should be derived per user. I.e., if someone signs up for a possible 30 a month total, then that averages to one a day. If someone else signed up for 120 a month, that would be a max of one every six hours, in their case. If a universal limit were much easier to implement, and I can see that it might be, then that might be another approach. That could be a question worth putting to a larger group. If it isn't harder to implement, I'd still like to see the derived figure; sign up for ten a month: you get a max of one every three days. Sign up for 720 a month, then you get a max of one an hour. Maybe that's too hard. Mathglot (talk) 09:03, 18 June 2020 (UTC)
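The derived figure is simple arithmetic: divide the month by the number of notices signed up for. A small Python sketch, assuming a 30-day month (the function name `min_interval` is hypothetical):

```python
from datetime import timedelta

def min_interval(notices_per_month, days_in_month=30):
    """Minimum spacing between notices for someone signed up for N per month."""
    return timedelta(days=days_in_month) / notices_per_month

per_30 = min_interval(30)     # one a day
per_120 = min_interval(120)   # one every six hours
per_10 = min_interval(10)     # one every three days
per_720 = min_interval(720)   # one an hour
```

The same function gives twelve hours for the 60-a-month case mentioned earlier in this thread.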
@Mathglot: Nope, that'd be absolutely possible to implement - but it might also not be how people want to do it, because the FRS is split into categories. Someone might want a maximum of one message per day for a GA nomination about tech, but nine a day about something else. This is another point I'll include in the user survey, to try and see what people actually want it to do - one outcome might be to have an automatic calculation of a derived figure, with an optional {{Frs user}} param to set a specific figure instead. Very few things are off the table on the basis that they're "too difficult" technically to do - I just want to make sure that the time and energy invested in doing the things are worth it for the demand in the community for the feature. Naypta ☺ | ✉ talk page | 09:07, 18 June 2020 (UTC)
Good point, didn't think of that; that is even better. And yes; definitely don't want to waste time doing something nobody is interested in, or would make it worse for them. Thanks for considering all the wrinkles. Mathglot (talk) 09:31, 18 June 2020 (UTC)
I will respond to implementation/functionality later today. --David Tornheim (talk) 18:57, 18 June 2020 (UTC)

Random selection

Rather than randomly reordering the list for each RfC, which could result in one person being selected several times in a row, have you considered "load balancing" the list by randomizing it once, and then going through it sequentially until everyone has been assigned a case? Once the list is exhausted, a new list could be generated. -- Norvy (talk) 19:49, 18 June 2020 (UTC)

@Norvy: Good suggestion! There's a few problems with doing it this way: firstly, it hugely delays the timeframe for new users being incorporated into the list, as it'd mean that they would need to wait for the list to be exhausted before they would even be considered. Secondly, it could cause issues if users unsubscribe, as they might keep receiving messages. Finally, it introduces a new moving part; rather than just taking the list as is from the wikitext, it'd then need to have a separate list of users, that the bot would have to maintain.
That being said, none of those issues is insurmountable; what might be a good idea would be to have the bot construct a list of applicable users on run 1, sort it randomly, store it in a JSON file, and then have each successive run running off that list. Each run would check at the start whether the list had been updated; if it had, it would re-create the list. If there were no changes, it could then use that method to evenly distribute the messages sent, whilst keeping the random sorting. What do you reckon about that? Also copying in David Tornheim and Mathglot who I suspect will be interested in this. Naypta ☺ | ✉ talk page | 20:16, 18 June 2020 (UTC)
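One way the stored-list idea above could be sketched, in Python rather than the bot's Go: shuffle once, persist the order as JSON, and rebuild only when the subscriber list has changed. The fingerprint check, the JSON layout and the names `load_or_rebuild` and `take_next` are all assumptions made for illustration, not a description of any existing file.

```python
import hashlib
import json
import random

def load_or_rebuild(stored_json, current_users, rng=random):
    """Reuse the stored order unless the subscriber list has changed."""
    joined = "\n".join(sorted(current_users))
    fingerprint = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    state = json.loads(stored_json) if stored_json else None
    if state is None or state["fingerprint"] != fingerprint:
        order = list(current_users)
        rng.shuffle(order)  # random order, fixed until the list changes or runs out
        state = {"fingerprint": fingerprint, "order": order, "next": 0}
    return state

def take_next(state, rng=random):
    """Hand out users sequentially; reshuffle once everyone has had a turn."""
    if state["next"] >= len(state["order"]):
        rng.shuffle(state["order"])
        state["next"] = 0
    user = state["order"][state["next"]]
    state["next"] += 1
    return user

state = load_or_rebuild(None, ["A", "B", "C"])      # run 1: build the list
first_pass = [take_next(state) for _ in range(3)]   # everyone gets exactly one turn
saved = json.dumps(state)                           # what would be stored between runs
unchanged = load_or_rebuild(saved, ["A", "B", "C"]) # same subscribers: order kept
changed = load_or_rebuild(saved, ["A", "B", "D"])   # list edited: order rebuilt
```

Rebuilding on any change means a new subscriber is picked up on the next run, and an unsubscribed user drops out at the same time, at the cost of restarting the rotation.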
Why not just have any updates to WP:FRS trigger a bot that updates the JSON file? Or just check the WP:FRS file once per day and put notice that says it will take 24 hours to update. Maybe it isn't a big deal to check if the file has been updated every time the bot is run, so your implementation might be fine too. --David Tornheim (talk) 20:47, 18 June 2020 (UTC)
@David Tornheim: That could be doable, but what's the advantage of doing it that way over the way that I suggested? It seems like the way I've proposed would involve a lot less legwork, rather than having to arbitrarily monitor every edit to the page as it's made. Unless you mean the same thing I've suggested, just without a full random shuffle when the list is updated - but without that, my concern is that we're shifting slowly further towards a separate user list, rather than keeping the user list canonically stored on the FRS page. In effect, I want to make sure that issues are preventable and fixable by as many people as possible, all the time. Naypta ☺ | ✉ talk page | 20:47, 18 June 2020 (UTC)
This was posted before I saw you had updated your comment, for what it's worth. Naypta ☺ | ✉ talk page | 20:48, 18 June 2020 (UTC)
How and where is the number of times a user has received a notice for the current month stored? -- Norvy (talk) 21:47, 18 June 2020 (UTC)
@Norvy: All data is stored on-wiki; the number of notices for each user is stored at User:Yapperbot/FRS/SentCount.json. Naypta ☺ | ✉ talk page | 21:48, 18 June 2020 (UTC)

Loops

My issue is your choice of outer loop:

MaxNotificationsPerRfC = 5 to 15
For each rfc ($r) {
    For $i = 1 to MaxNotificationsPerRfC {
        user $u = randomly selected user who has not reached notification limit -and- who has not yet been notified of $r
        notify $u of $r
    }
}

I really don't like the idea of randomly choosing who will be notified based on some arbitrary number of notifications per RfC chosen by the program/programmer. This approach has too many possibilities of bias. It also has the strong possibility that users who have capacity to receive many notification won't get them.

I believe the outer loop should be the user, not rfc, like this:

For each user ($u) {
    For each category ($c) {
        If it is too soon for $u to receive notification in $c, then skip to next $c.
        For each rfc ($r) in category ($c) { /* starting with oldest first, or chosen randomly from the list of open RfCs */
            if ($u has been notified of $r) skip to next $r
            else {
                notify $u of $r
                $t = soonest user can receive another notification in $c.
                break; (i.e. skip to next $u)
            }
        }
    }
}
run bot again for min. $t found.

When a user is available to receive an RfC and there is more than one RfC available that they haven't been notified of, then I don't mind if the choice of the RfC to be notified is random, although I think it is better to choose the oldest (or newest first). I just don't see much need to use random selection. --David Tornheim (talk) 21:41, 18 June 2020 (UTC)
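For comparison, the user-outer loop proposed above can be turned into a small runnable sketch. This is Python, with simplifying assumptions that are not part of the proposal: a single fixed `gap` between notices per user and category (instead of a per-user $t), RfCs listed oldest first, and hypothetical names throughout.

```python
def first_due_rfc(user, categories, open_rfcs, notified, next_allowed, now):
    """Find the first (category, rfc) this user can be told about, or None."""
    for category in categories:
        if now < next_allowed.get((user, category), 0):
            continue  # too soon for this user in this category
        for rfc in open_rfcs.get(category, []):   # assumed oldest first
            if (user, rfc) not in notified:
                return category, rfc
    return None

def run_once(users, open_rfcs, notified, next_allowed, now, gap):
    """One pass over all users; each user gets at most one notice per pass."""
    sent = []
    for user, categories in users.items():
        hit = first_due_rfc(user, categories, open_rfcs, notified, next_allowed, now)
        if hit is None:
            continue
        category, rfc = hit
        notified.add((user, rfc))
        next_allowed[(user, category)] = now + gap
        sent.append((user, rfc))
    return sent

users = {"A": ["hist"], "B": ["hist", "pol"]}
open_rfcs = {"hist": ["r1", "r2"], "pol": ["r3"]}
notified, next_allowed = set(), {}

pass1 = run_once(users, open_rfcs, notified, next_allowed, now=0, gap=10)
pass2 = run_once(users, open_rfcs, notified, next_allowed, now=5, gap=10)
pass3 = run_once(users, open_rfcs, notified, next_allowed, now=10, gap=10)
```

In this toy run, B picks up the "pol" RfC on the second pass because only the "hist" category is still throttled, which is the per-category behaviour the pseudocode describes.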

@David Tornheim: This isn't a code issue, this is a "what is the FRS" issue, as we've discussed earlier. Your proposed loop is just an implementation of what you think the FRS should be - with everyone receiving all messages all the time that they can, rather than having any random element involved. That's fine, and a valid viewpoint, but it's different to what the FRS is documented to be - hence why it's the first question in the proposed survey. Naypta ☺ | ✉ talk page | 21:44, 18 June 2020 (UTC)
Okay, I see you are correct that the spec. for WP:FRS says:
By signing up in the section of your interest, you may be randomly selected to receive a user talk page invitation to participate in a discussion in that topic area.
It appears this language has been in place since the page's inception in 2011.
I will try to collect more data so that if your outer loop stays, the max. # of people selected to be notified is sufficient that editors are being notified close to capacity rather than way under. Consider for example that this RfC received 522 responses. I'm going to find out if the Legobot did notification on it, and, if so, how many received notification. --David Tornheim (talk) 22:13, 18 June 2020 (UTC)

Category Questions

Category Questions (set 1)

@Naypta: I have created tables for example situations to ask you questions about how Yapperbot handles particular ambiguous situations. It's never been clear to me what Legobot or Yapperbot is supposed to do in these situations. No need to tell me what you think Legobot did, I'm only interested in Yapperbot's implementation.

Example RfCs
RfC type    | RfC belongs to Category X? | RfC belongs to Category Y? | RfC belongs to Category Z?
RfC type X  | Yes                        | No                         | No
RfC type Y  | No                         | Yes                        | No
RfC type Z  | No                         | No                         | Yes
RfC type XY | Yes                        | Yes                        | No

Example editor configurations
Editor Name | # of notices/mo. in Category X | # of notices/mo. in Category Y | # of notices/mo. in ALL categories
Editor1     | 1                              | 1                              | 1
Editor2     | 1                              | 1                              | 0

Situation 1: It is the beginning of the month. Yapperbot is to process an RfC of type X.

Question 1: In Situation 1, would Editor1 and Editor2 have exactly the same probability of receiving a notification of RfC type X?

Question 2: For Situation 1, let's say that Editor1 did receive a notification of RfC type X, and another RfC of type X comes in. Is Editor1 maxed out for the month for RfC's of type X, or can Editor1 receive the notification since Editor1 is permitted 1 notification for "all categories"?

Question 3: For Situation 1, let's say that Editor1 did receive a notification of RfC type X, and then an RfC of type Z comes in. Is Editor1 maxed out for the month, or can Editor1 receive the notification of Z because Editor1 is permitted 1 notification for "all categories"?

I hope this is not too confusing. --David Tornheim (talk) 11:21, 7 July 2020 (UTC)

@David Tornheim: No worries!
  1. No. In Situation 1, Editor1 is more likely to receive the message, by an extra weight of 0.5. Each subscription is treated in isolation. That means that Editor1 will have a "weighting" of 1.5, with 1 because of the exact category match and the 0.5 from their subscription to the All category, whereas Editor2 will have a "weighting" of just 1, from the exact category match.
  2. As each subscription is treated in isolation, Editor1 could receive the notification through their "all categories" subscription, although as mentioned above, the probability of them doing so is low if other users are available. That being said, it is possible, because the selection is random.
  3. Editor1 could receive the notification about Z if and only if the notification that they had received about RfC type X was in their type X-specific subscription. That is to say, if the original type X RfC was sent through their "All categories" subscription, they would be considered maxed out for "All categories", but not for type X. Consequentially, they could get another type X, but not a type Z.
I hope that's helpful! Naypta ☺ | ✉ talk page | 11:57, 7 July 2020 (UTC)
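To make the weighting in answer 1 concrete, here is a small Python sketch. The weights of 1 and 0.5 come from the answer above; everything else is an assumption for illustration, in particular that the weights translate directly into selection probabilities, and the names `weights_for`, `EXACT_WEIGHT` and `ALL_WEIGHT`.

```python
import random

EXACT_WEIGHT = 1.0   # subscription to the RfC's own category
ALL_WEIGHT = 0.5     # subscription to "All categories"

def weights_for(rfc_category, subscriptions):
    """Map each eligible user to a weight; each subscription is counted on its own."""
    weights = {}
    for user, categories in subscriptions.items():
        weight = 0.0
        if rfc_category in categories:
            weight += EXACT_WEIGHT
        if "All" in categories:
            weight += ALL_WEIGHT
        if weight:
            weights[user] = weight
    return weights

# Situation 1: Editor1 subscribes to X, Y and All; Editor2 only to X and Y.
subscriptions = {"Editor1": {"X", "Y", "All"}, "Editor2": {"X", "Y"}}
weights = weights_for("X", subscriptions)
total = sum(weights.values())
chance = {user: weight / total for user, weight in weights.items()}

# One weighted draw; the result varies from run to run.
pick = random.choices(list(weights), weights=list(weights.values()), k=1)[0]
```

Under that assumption Editor1 would be picked for a type X RfC about 60% of the time and Editor2 about 40%, and for a type Z RfC only Editor1 (weight 0.5) would be eligible at all.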
Naypta Yes, that makes sense, and I thought that was roughly how you implemented it (except for answer to 3)--I wanted to be sure. That's similar to how I might interpret the relationship between All categories and each specific category.
These answers will help me understand your code.
I had another question, but I think I might be able to answer it by looking at the code. Thanks for the quick answer. --David Tornheim (talk) 12:24, 7 July 2020 (UTC)
Category Questions (set 2)

No question after all. I was able to find the answer. --David Tornheim (talk) 13:58, 8 July 2020 (UTC)

Heh, that's always nice to hear. If I can help at all, David, please let me know! Naypta ☺ | ✉ talk page | 14:14, 8 July 2020 (UTC)
Thx. --David Tornheim (talk) 14:22, 8 July 2020 (UTC)

Data analysis

Here I am including sources for data and some initial data collected.

  • RfCs (open)
  • Approx. 84 unique pages with at least one open RfC (165 listed - 81 duplicates)
  • Note: Some pages have more than one open RfC, e.g.
  • Note: I could calculate a more exact number of open RfCs using the unique RfC id # as done for WP:RS/N above.
  • users
  • Approx. 700 unique users desiring notification. (anywhere from 1/month to unlimited)
  • Approx. 2200 entries (extra are users who have listed themselves in multiple categories)
* I see that when I collected the # of users, I accidentally included users wanting GA notification. I will need to recalculate for that.
  • notifications (that have gone out)

[to be continued]. --David Tornheim (talk) 09:13, 19 June 2020 (UTC)

Wow, these data are great. It's also completely independent of this discussion, and I'd hate to see it get lost. David, can you keep your message in mind, or the expanded version of it that I imagine you may have in offline notes, for possible inclusion as a stand-alone subpage at a more general location? This would be great information to keep in a hierarchical collection somewhere and collect periodically (.../Rfc_notif_stats/2020/June/, */July/, ...) and when someone gets ambitious, to make a time-series line graph out of. Mathglot (talk) 19:40, 21 June 2020 (UTC)
@Mathglot: Thanks for the kind words! Don't worry, we will make sure it does not get lost. Time series graph sounds exciting.  :) FYI. I have never created a graph on wiki--I have made very fancy ones in Excel! I have played with tables, e.g. User:David_Tornheim#Table
Note: I am not using sophisticated software to scrape the wiki pages for data. A lot of it is "by hand": copying it to Word, filtering things out with search/replace wildcards, transferring to Excel and using Excel's sophisticated tools for analyzing data in tables. I did that for the two key #'s above, but there is more data to be collected for other numbers.
Estimating the number of new RfC's per day (and per category) over time is going to be trickier. Even more so is estimating the # of users who have responded to the notifications.
However, these are some of the most important numbers to determine how effective the bot is with bringing users to RfCs. [I don't care so much about GA noms. Someone else can collect that data if they want. :)] --David Tornheim (talk) 03:00, 22 June 2020 (UTC)
@David Tornheim:,
  • A lot of it is "by hand"...
Yes, so I assumed. Hopefully, not inordinately involved, so that if done once a month, maybe not too burdensome, if volunteers would share the responsibility for it.
  • some of the most important numbers to determine how effective the bot is...
Absolutely, which is why it would be great to keep it up. Not suggesting you do it, just applauding the conception of it. Mathglot (talk) 03:07, 22 June 2020 (UTC)
@Mathglot: Hopefully, not inordinately involved, so that if done once a month, maybe not too burdensome. If I had to do it more than 3 times, I would probably end up writing a script or Python to process each page. Doing it multiple times right now is not too hard--while the steps and pages are fresh. Trying to get into a routine to recreate the data analysis once a month (even if I document every step) would be no fun. I would probably only be motivated to update it that far in the future, if I saw a pressing need. Such need might be a bug, ongoing discussion, or frustration, wherein new data is needed to answer pressing questions and old data is not sufficient. Right now, it seems easier just to collect as much relevant past data as I can and see if we can figure out how the current bot would process each pertinent unique situation.
That's where testing comes into play, or at least modelling. That's how we did hardware design, where we had to come up with every possible input combination that might screw up the processor BEFORE it was fabricated. Making the first batch on silicon was extremely expensive and time consuming, so you wanted the first batch to be bug free, and if not, the next batch for sure. And once the first batch came out: testing, testing, testing. I came to believe more resources were devoted to testing than to design. At least that's the feeling I got about what has been going on at Intel. This paper says Intel hired 60 recent grads devoted to testing the Pentium 4 processor and its new architecture.
Depending on how hard it was and how much demand there is for the data, I would consider making a program like WP:WikiBlame that creates the data when the user requests it. I would be far more excited to do that if someone (maybe you? or Naypta?) helped me with the Wikipedia-specific obstacles--especially how to do input/output from/to a wikipage. It does occur to me that I might be able to do that in HTML/CSS user box using the programming/scripting language I learned that goes with HTML/CSS, and I could simply require the user to copy the Wikitext to an input box, the program/script processes the input, and outputs it to another HTML page--like those programs that count word frequency, e.g. [2].
if volunteers would share the responsibility for it
Unless you volunteer for that, I seriously doubt anyone else would! LOL. --David Tornheim (talk) 04:44, 22 June 2020 (UTC)
@David Tornheim: I'm very happy to help with anyone who wants to do wikitech! Not quite sure that the stakes of a Wikipedia bot are quite the same as a multi-million dollar run of silicon chips being produced by one of the world's largest companies, though. Naypta ☺ | ✉ talk page | 08:23, 22 June 2020 (UTC)
@Naypta: Thanks for your willingness to help!
First let me point out that:
Wikipedia may, in fact, have as significant of an impact on the world as Intel precisely because it is not focused on profit, and because it is run by people who don't come here for a paycheck: We take the work seriously, have high standards for accuracy, and want a quality product. New editors who don't adhere to our standards quickly discover their shoddy work is unapologetically reverted. If they interfere with the work of those trying to maintain high quality, they eventually get blocked for being disruptive.
Wikipedia is one of the most popular sites in the world because readers see the benefit of articles written by people who are not trying to sell them something--unlike almost all other media that has some agenda related to making money. With 25 billion views per month, Wikipedia/media exceeds countless for-profit companies and their products, such as Netflix, Instagram, and Bing.
I have more to say on this subject... which I will add to #Importance of Bot. --David Tornheim (talk) 11:26, 22 June 2020 (UTC)
With regard to your willingness to help, which I greatly appreciate: Here are some questions, in order of importance:
(1) How would I write a routine to determine how many new RfCs are launched on a particular day and get a link for each?*
(2) How would I write a routine to determine what categories each new RfC in (1) belongs to?*
(3) Once I have the data and processed it, how could I output it to some wikipage?*
* I know your bot does this or something like it, but I don't know where that code is. I know you use some "JRON" file or something like that to read and output some data. The I/O facilities for Wikipedia are completely foreign to me. I only know the user interface, and I have a vague sense of what bots like yours do internally, but not how to write the code that handles the key I/O aspect.
(4) Can I do it without a bot? Could I do it with a script for example?
(5) When bots re-execute every N hours, is that something that the user has to manually force to happen? Does your computer make it do that, or is there some process on Wikipedia that relaunches your bot based on some timer?
It looks to me like at least the first time, you launch your bot from Unix. I have only Windows 10. Does that complicate writing script or bots?
Or is there a way to store the code online and tell some process in the Cloud to run it? i.e. do you plop the code in a window like this and it can run it there?
--David Tornheim (talk) 11:26, 22 June 2020 (UTC)
@David Tornheim:
  1. Determining the categories of RfC would need to be done using a regex, to match the RfC tag and the categories associated with the tag on the page; you can see my implementation of that here.
  2. If you wanted to create a bot to output it to a page, you could use the Edit API, but for a use like this, I'd recommend just generating it on your desktop computer into a file or whatever; then, you could paste the relevant information wherever you like. This is especially true as any bot that's doing any editing beyond its own userspace would need BAG approval.
  3. See above - the answer is yes!
  4. I use Wikimedia Toolforge, a hosted environment offered by the Wikimedia Foundation for projects that help improve the wikis, to host Yapperbot. The relevant tasks are then submitted to a grid engine setup according to a schedule I set in a crontab on there. Toolforge uses Linux, but there's nothing to stop you interacting with the same APIs on Windows; I explicitly would not recommend that a beginner use Golang, however, especially if you have not used Golang before, as the vast majority of wiki scripts and bots are written in PHP or Python.
Naypta ☺ | ✉ talk page | 12:38, 22 June 2020 (UTC)
@Naypta: Thanks. This will also help me get a feel for both how your program works and how it might be possible for me to do the data analysis I was working on above. I believe that's everything I have to say about coding for tonight. --David Tornheim (talk) 14:17, 22 June 2020 (UTC)
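For anyone following along, the regex approach mentioned above (matching the RfC tag and the categories attached to it) can be sketched roughly as follows. This is a minimal illustration in Python, not Yapperbot's actual implementation (which is written in Go and linked above). The function name `rfc_categories` and the sample wikitext are made up for the example, and it assumes the tag has the form `{{rfc|cat1|cat2|rfcid=...}}` with no nested templates inside it.

```python
import re

# Assumption: an RfC tag looks like {{rfc|cat1|cat2|rfcid=XYZ}} and contains
# no nested templates. A bare {{rfc}} with no parameters is not matched.
RFC_TAG = re.compile(r"\{\{\s*rfc\s*\|([^{}]*)\}\}", re.IGNORECASE)


def rfc_categories(wikitext):
    """Return one list of category codes per RfC tag found in the wikitext."""
    results = []
    for match in RFC_TAG.finditer(wikitext):
        params = [p.strip() for p in match.group(1).split("|")]
        # Positional parameters are treated as categories; named parameters
        # such as rfcid=... are skipped.
        results.append([p for p in params if p and "=" not in p])
    return results


# Hypothetical page text, for illustration only.
sample = "== Question ==\n{{rfc|hist|pol|rfcid=ABC1234}}\nShould the lead mention X?"
print(rfc_categories(sample))
```

The output could be written to a local file and pasted wherever it is needed, which avoids needing bot approval for on-wiki edits.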

Survey (drafting)

[This is mentioned a bit above. I will copy those relevant parts of conversation soon.--David Tornheim (talk) 18:41, 18 June 2020 (UTC)]

@Mathglot and David Tornheim: I've created Wikipedia:Feedback request service/2020 survey, which I intend on sending to FRS subscribers and publicising at the Village Pump, as mentioned earlier. I want to make sure you're both happy with the way I have worded the proposals before I do, though. Please let me know if you think it looks good, or if there are any changes you would like to be made (or equally, additional questions to ask!) Naypta ☺ | ✉ talk page | 10:48, 18 June 2020 (UTC)

@Naypta: Great. I support the idea--I probably would have done it if you hadn't. Thx for giving us a chance for input. Please give me 24 hours to reply & provide any proposed changes before launching. I strongly agree with Mathglot that "if you do go poll the group about anything, I would only beg that you keep it to the most focused and targeted question possible, to avoid opening up a free-ranging discussion that could open up a can of worms." I have seen numerous poorly formed RfCs that made a disagreement worse and even more confusing, that were closed as no consensus. This is not so helpful at resolving the problems or moving forward with a new clearer direction. Often the worst problem is the question was too vague or made assumptions that respondents did not agree with and then argued about. Sometimes a yes/no is best. The less text users have to read the better. Links can be provided for discussion context. --David Tornheim (talk) 18:52, 18 June 2020 (UTC)
Yeah, I'm a bit slow, as well. Coupla days? (David, thanks for new section header. I unindented, feel free to undo if you prefer.) Mathglot (talk) 19:01, 18 June 2020 (UTC)
Fine with me. --David Tornheim (talk) 03:21, 19 June 2020 (UTC)
So I guess my main reaction is what I alluded to earlier, it isn't sufficiently targeted on the issue at hand. The survey seems to open it up to discussion on several fronts. While there's nothing wrong with that in theory, I'd rather see that happen at a time when the waters are calm and there are no special issues going on, so that the discussion can be as free-ranging and varied as desired, and take as long as needed; perhaps a multiplicity of responses and good ideas could then be considered in a second round, winnowed down, perhaps voted on and prioritized if there are numerous good suggestions about which ones to do first and how to do them, and so on.
This is not the time for that. We have a specific situation going on, or that has just gone on, that elicited a flurry of comment. The situation is now quiescent, but is lurking. Other than that, to my knowledge no one is currently thinking about the bot as far as expanding it or altering it. In accord with "The squeaky wheel gets the grease," we should now attend to the squeaky wheel, not survey the driver about the rattle in the dashboard, and whether this is a good time to reupholster the seats. If you are going to survey, I would eliminate all questions but one, and that one should be focused on the frequency question, and what should happen during a cold start or with new users. My fear is, that a broad survey would obscure and deprioritize an acute situation, in favor of open-ended discussion regarding aspects nobody is complaining about, take who knows how long, and defer action on the frequency issue, for new design and coding issues that are more fun. That would be unfortunate, imho.
Consequently, I would keep question #2, and drop the others for now. I think that question needs to be placed in context, to explain why you're asking. The first sentence sort of implies that, but you are so familiar with the code and the issue, it may make it easy to forget the fact that some people are blithely ignorant of what's going on and need a brief intro. I would add something like, "Currently, at start-up it is possible for a user to receive their entire monthly allotment within a few minutes, with updates to their Talk page every few seconds," perhaps adding, "This mostly applies to new users, or cold starts after the bot has been shut off for some time, as recently occurred." That would clarify why you need to ask this question at this time, and help them focus on the fact that this is a real issue, and not some sort of blue-sky ideal.
That is the squeaky wheel that needs fixing, so in my view, that is the only question that should be discussed at this time. Given that it's only one question, you could do it as an Rfc, rather than as an open-ended survey. I think the survey idea for broadly discussing the bot's functionality is great, and you should hold on to it and expose it when the waters are calm, but right now seems like the wrong time for that, in my opinion. Just my 2 cents, and thanks for asking! Mathglot (talk) 18:55, 21 June 2020 (UTC)
@Mathglot: Each of the questions on the survey is something that's been brought up, either here by you and David or by KarusuGamma on WT:FRS; people are thinking about expanding the bot, consequently I think it's probably better to ask these questions all at once, given that they are all known and clear at this point, because it means there's justification for sending a message to all FRS users about it, which would be a lot more difficult to do piecemeal; it also means that it's a whole lot easier to encourage people to respond, because they can immediately see how aspects of it would affect them. "The squeaky wheel" here, the issue with a cold start sending many messages, is already partially addressed by changes I've made earlier to the bot, and either way will not come up in the near future, as the bot is running every hour automatically on Toolforge; there's nothing urgent to address in that sense, or I would have patched it already.
I'm not sure I'd describe any of the aspects in the survey as "fun coding"; they're all relatively boring maintenance tasks. That's not the point of it, though; I didn't sign up to this to do "fun coding", I wanted (and still want!) to build something useful for the community. If your fear is I'm going to get sidetracked implementing new features - well, the extent of the new features even possible there is "add two new categories", one of which I've already actively opposed, so I feel that fear is somewhat unfounded.
It is no longer possible for updates to their Talk page every few seconds to happen, as I have already implemented rate-limits on how fast the bot can send messages (I think I mentioned that somewhere above, but I wouldn't swear to it!). That being said, I'm happy to add (and have added) clarification on the point of the cold start - please do take a look and let me know if it reads well to you.
David Tornheim, hope you're well - have you had a chance to look at the survey?
Cheers, Naypta ☺ | ✉ talk page | 22:36, 21 June 2020 (UTC)
@Naypta: Thanks for asking. Yes, I read it. I must confess that when I first read it a few days ago, I only read the first two questions and both seemed good. The intro and first two questions were pretty well written compared to what I expected and compared with many of the more complicated RfCs I have seen. I salute you for good writing skills--something many programmers lack. ;)
I had meant to get back to reviewing it, but when you stopped replying here, I was wondering if you were still listening. I'm glad we have your attention.
I do see both your perspectives about the scope of the survey. I wonder if there is some way to work out a plan that all of us think is productive.
The one point I do disagree with Mathglot about is that although I agree the cold-start is a problem, I feel that a number of us are concerned not just with the cold start but other ways that it functions differently than Legobot by undernotifying people.
I'm particularly concerned with item #3--with the potential for almost no notification at all at the end of the month. It seems that the probability of *any* notification going out should not be dependent on what day of the month someone submits the RfC, and that the number of notifications for any RfC should not be significantly affected by the number of RfCs sent in the previous day or week. This to me is a very big problem that needs to be addressed. It will enable editors to game the system by submitting RfCs they want low feedback on at the end of the month or concurrently, and those who prefer more input to submit them at the start of the month or wait until there is a lull in RfC submissions--this is very undesirable in my opinion. It puts an unnecessary burden on RfC creators who want to use the RfC for what it is intended for--to get a lot of input from non-involved editors.
These two problems can easily be resolved by changing the code IMHO and I believe a number of programmers have explained ways to avoid it.
I did come up with a way that would allow you to make the $rfc the outside loop rather than $user, but I have been waiting until you were replying back here before submitting it.
I did try to look at the code a couple of days ago. I will comment on that soon. The short version is: Could you please document it?--at the highest level describing exactly how it functions, on each page, explaining what the module does, what each subroutine is for, any important dependencies on other modules and what uses it, each subroutine in each module, its inputs and outputs and every significant variable, especially any global variables (not things like indexes and temp variables). I don't even know where to look for routines called from main without searching every module or guessing, which routines are doing the input or output from the program. Sorry if I sound like your professor or TA. --David Tornheim (talk) 02:35, 22 June 2020 (UTC)
@David Tornheim: I fear that may be rather damning with faint praise: "your writing is good for a programmer". That notwithstanding, I've read every message posted here, and replied to everything that I think requires a reply from me; if you feel I've missed something, please let me know and I'll be happy to address it.
With respect to item 3, I agree that the issue you have raised is a problem; I disagree that "a number of programmers have explained ways to avoid it", primarily because I'm not entirely convinced that there is a solution, beyond the imperfect ones I've proposed on the survey. Exaggerating the problem, if you have 100 RfCs in the first week of a month, and two users who each have a limit of 50, you can do one of four things, as far as I can see: send the first 50 to both, and then send nothing for the next three weeks; send 50 to one and 50 to the other, and then send nothing for the next three weeks; miss out some of those RfCs in order to achieve a broader spread over the month; or disrespect their limits. None of those solutions are ideal, which is why I've proposed the point on the survey I have about queueing, to see whether that's an acceptable way of dealing with the problem for people. If you have a better solution to this fundamental problem in the logic of the system, rather than the functioning of the code, I'd love to hear it.
I'd like to state again, as I did the last time you mentioned loops, that the issue here very much isn't one of code; I could change the FRS to work the way that you want it to unilaterally in the code this morning. My point is that, because that's different to the way that it's documented (irrespective of whether, perhaps, Legobot didn't actually do what it was documented to do anyway) I think it should be something that happens as a result of an outcome from a user survey, not just a discussion here. I'm more than happy to receive constructive criticism on the codebase, obviously, but I think it is important that we are clear on when issues are related to the codebase, and when issues are in fact related to the specification.
Finally, as I'm sure you've seen already, the vast majority of methods do have a documentation comment; if your concern is code navigation, any reasonable code editor will permit you to ctrl-click on a method to see its definition, and on a method definition to see its usages. Personally, I developed the system in VS Code, but that's by no means saying you must or even should use that - and, if you don't fancy using any different software, GitHub itself will actually do code navigation automatically for you if you just click (not ctrl-click) the method names in the code. Hence, I don't think that needs separate documentation - but if you have any specific questions, please feel free to shoot them across. Naypta ☺ | ✉ talk page | 08:22, 22 June 2020 (UTC)
Naypta I believe the problem in 3 can be solved. I believe that the data analysis will help make that clear. I do have a proposal for how to do it. I will post that soon.
In regards to your desire to send out the survey, do you mind waiting while we continue to discuss it with Mathglot and whoever else wants to discuss it?
I'm hoping to propose a solution that will make questions #1 - #3 unnecessary or simpler. I think the problem is that you are focusing on making sure that every RfC or GA gets some arbitrary number of notifications (e.g. 5 to 15), when there could potentially be 0 users that want notifications of that kind of item or 30,000 users who do.
When the bot has reached steady state (i.e. running for one month or more), the amount of notification for any RfC should be approximately (# of users who want notification in the category) * (average # of notifications each user wants per month) / (average # of RfCs per month in that category). I believe the notification per RfC will be approximately half that if the bot has stalled for one full month or more and rise to the full amount once one month has elapsed.
You seem to believe this number is always between 5 and 15, but you haven't provided any data analysis to establish that, and I believe the number is significantly higher than that. The appropriate number is highly dependent on the number of people who sign up to each category, the frequency of messages they want, and the actual number of RfCs that are coming in. Any of these numbers can change, and I believe your program should adapt to the changes to smooth out the notifications rather than work on some fixed and arbitrary number of notifications per RfC, or by letting the timing of each RfC having a big effect on the number of people who receive notification of it.
Maybe it does do that, but I have not seen evidence of your model that keeps track of the necessary data, and the cold start behavior seems to indicate it is not taking data analysis into account.
If done this way, the only reason that the bot would fail to notify anyone of a particular RfC in a particular category is because the users don't want to be notified, not because the RfC happened to come on the wrong day of the month or too close to some other RfCs. I'm pretty sure another editor was trying to explain this too. --David Tornheim (talk) 15:22, 22 June 2020 (UTC)
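The steady-state estimate described above can be written out as a one-line calculation. The following is only a sketch of that arithmetic; the function name is hypothetical and the input numbers are invented for illustration, not measured from actual FRS data.

```python
def expected_notifications_per_rfc(subscribers, avg_limit_per_month, rfcs_per_month):
    """Steady-state estimate: total monthly notification capacity in a
    category divided by the number of RfCs arriving in that category."""
    if rfcs_per_month <= 0:
        raise ValueError("rfcs_per_month must be positive")
    return subscribers * avg_limit_per_month / rfcs_per_month


# Invented example figures: 200 subscribers, each wanting about 3 notifications
# per month, in a category that receives 40 RfCs per month.
print(expected_notifications_per_rfc(200, 3, 40))
```

With real subscriber lists and RfC counts substituted in, the same calculation would show whether the per-RfC figure really falls in the 5 to 15 range.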
@David Tornheim: I welcome any specific proposals you can put forward to address this issue, and I look forward to reading them. I don't seem to believe anything specifically about the way that the selection limits work; the bot is set up to do that, on the basis that I considered it a conservative initial estimate, not on the basis that it would necessarily stay that way in perpetuity. I maintain, however, that whatever formula the bot uses, it cannot in any sense resolve a situation in which there is one user, with a limit of one, subscribed to a category that receives 5 RfCs beyond just ignoring some or queueing more. That is, fundamentally, the even more exaggerated version of the problem with which we find ourselves confronted here - unless we are talking at cross purposes, in which case there has been a miscommunication between the two of us on what we are discussing. Naypta ☺ | ✉ talk page | 15:41, 22 June 2020 (UTC)
Naypta: "that whatever formula the bot uses, it cannot in any sense resolve a situation in which there is one user, with a limit of one, subscribed to a category that receives 5 RfCs beyond just ignoring some or queueing more." This is where the data analysis is helpful. In the case you proposed, the formula computes (1 user in category) * (1 notification per month) / (5 RfCs per month) = 0.2. What that says is that the rate at which the user receives notifications is 1/5 of the rate at which the RfCs are coming in, hence that user should only receive 1 of every 5 notifications. Of the 5 that come in for the month, the probability of receiving each should be identical = 0.2.
That's for the ideal case where the RfCs come in uniformly every 6 days. However, we know there will be bunching where, for example, there could be 9 RfCs in January, 1 RfC in February, 9 for March, 2 for April, 8 for May, or the average may increase over time. In these cases the probability the user should get February's RfC notification should be close to 1 and the probability for most of the RfCs in March closer to 0.11. (A cold start can be modeled as a bunching problem, where there are 0 for one month and then suddenly a huge number come in) The question is:
How should the bot adapt to dynamic swings in the rate of incoming data?
What I and the other person are saying is that the ideal is to use some kind of smoothing of the erratic variations of the input, so that the output is more uniform to the specification desired by the user of only getting N notifications per month. My instinct is that the best way to do it is with a Moving average, ideally an Exponential moving average (Exponential smoothing) or use of similar methods and calculations or smoothing functions from Queueing theory.
I was almost sure I saw another person mention Throttling process (computing) (or "data throttling", e.g. Bandwidth throttling), but I looked all over and can't find that comment.
Mathglot's comment here explains an easy way to address the problem of spikes in input data, although it does not specifically mention smoothing calculations that adapt to changes in the rate of data coming in. --David Tornheim (talk) 02:34, 23 June 2020 (UTC)
A binary signal, also known as a logic signal, is a digital signal with two distinguishable levels
P.S. I'm fairly confident certain ideas from signal processing can be borrowed by conceptualizing the RfCs as a digital "signal input" (where the input is the number of new RfCs since the last sample), and sending the input through a Recursive filter, Nonrecursive filter, Adaptive filter, or other Digital filter ([3]). If you haven't taken a signals and systems course, the concepts may be challenging. I have a feeling a simple Recursive filter could smooth out the signal. These filters are very easy to program; however, picking the best digital filter for the job, computing the ideal coefficients, or analyzing their behavior using the Z-transform is the hard part. An engineer working under time pressure would likely try out one or more of these filters and tweak the coefficients using sample data until the filter performed as desired for every important, unique and representative data configuration, and most likely skip any z-transform analysis--even if they worked for Intel. :) --David Tornheim (talk) 08:10, 23 June 2020 (UTC)
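To make the smoothing idea concrete, here is a minimal sketch of exponential smoothing written as a first-order recursive filter, y[n] = alpha * x[n] + (1 - alpha) * y[n-1], applied to the bunched monthly counts used as an example above. The function names, the smoothing factor `alpha = 0.3`, and the rule for turning the smoothed rate into a send probability are all assumptions made for illustration; nothing here reflects how Yapperbot or Legobot actually work.

```python
def smoothed_rate(counts, alpha=0.3):
    """Exponentially smooth a sequence of per-period RfC counts.

    First-order recursive filter: y[n] = alpha * x[n] + (1 - alpha) * y[n-1],
    seeded with the first observation. alpha is an arbitrary illustrative value.
    """
    if not counts:
        return []
    estimate = float(counts[0])
    history = [estimate]
    for x in counts[1:]:
        estimate = alpha * x + (1 - alpha) * estimate
        history.append(estimate)
    return history


def send_probability(user_limit, estimated_rfcs_per_month):
    """Chance of notifying a user of any one RfC, given the estimated rate."""
    if estimated_rfcs_per_month <= 0:
        return 1.0
    return min(1.0, user_limit / estimated_rfcs_per_month)


# Bunched monthly counts from the example in the discussion above.
monthly_rfcs = [9, 1, 9, 2, 8]
rates = smoothed_rate(monthly_rfcs)
for month, (raw, rate) in enumerate(zip(monthly_rfcs, rates), start=1):
    print(month, raw, round(rate, 2), round(send_probability(1, rate), 2))
```

The smoothed rate swings much less than the raw counts, so the per-RfC probability stays fairly steady. Whether that is the desired behaviour, and which `alpha` suits real data, would need testing against actual RfC history.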
@David Tornheim: I'm not an electronic engineer, so digital signal processing is not a specialist area of mine; I understand the very basics of it, but not a great deal beyond that. That being said, without going into the specific details of how the smoothing might or might not work, I see a few conceptual issues with it.
  1. Assuming that the limiting system is still used in the same way, the proposed method still results in some RfCs being missed. In the example given, for instance, there is a 0.8 probability that any given RfC will not be sent to a user. If the limiting system was used as input for the probabilities, but did not in fact produce a "hard limit", that would be incredibly confusing for end users.
  2. The method still does not solve the problem you've raised of having a greater probability of messages being sent at the start of the month than the end. The cumulative probability of users having reached their limits will still be greater at the end of a calendar month than the start.
  3. The smoothing process and calculations involved prevent a lay-user from understanding the system easily. I don't have any specific evidence to back this up, but I strongly believe that systems that are effectively "black boxes" are less likely to attract signups, especially among the userbase of Wikipedia. The code may be open, but if a non-technical person cannot understand it, then it's fairly meaningless.
What may be an option to help address the actual issue you raised, which I'd not thought of before, would be to actually make the list expiry and regeneration system "at least one month". To expand slightly on that: rather than having each calendar month guaranteed to be a new set of users, have it so that it regenerates a list after at least one month, but if the list still has users on it who haven't reached their limit at the end of the month, continue sending to the current list until everyone is at limit. However, that could cause issues with a small number of users receiving a huge proportion of the messages, especially those who have limits of 100+, and a fair few of whom are inactive. Naypta ☺ | ✉ talk page | 09:11, 23 June 2020 (UTC)

Naypta I believe all of these problems are relatively easy to solve:

the proposed method still results in some RfCs being missed.

This is not a problem, but a consequence of the user indicating they don't want to be notified of every RfC, but only an average of N per month. If the user wants no more than 1 every month, but 5 come in every month, putting the 4 that don't go out in a queue won't help at all: 80% will expire before the user can be notified of them. It's like a bathtub where the drain is going at 20% the rate of the water coming in--the bathtub will overflow. It's impossible to notify them of all the RfCs. Instead one has to choose which ones they do and don't get. What I am discussing is how those can be chosen to avoid unnecessary biases. (Whether a queue would be necessary to notify users evenly is unclear at this point. I believe there are other ways to code it that can smooth the output similar to a queue that would not require one: For example, keeping a running list of every open RfC and every user who has been notified of it.)

If the limiting system was used as input for the probabilities, but did not in fact produce a "hard limit", that would be incredibly confusing for end users.

If properly coded, the data they receive could easily be restricted to a hard limit of no more than N per month. If notifying the user of a new RfC would cause the rate to exceed the maximum (e.g. more than N notifications in the last 30 days), then that notification cannot go out immediately. This is precisely what Mathglot was saying. Whether that notification should be delayed or never go out for that user would depend on the data. A proper analysis of old data either by hand or on the fly should be able to give a sense of what is best--there really is no need to ask this in a survey.

The method still does not solve the problem you've raised of having a greater probability of messages being sent at the start of the month than the end.

Actually it does. Hopefully the input flow does not depend on what day of the week or month, and neither should the output. Using a Moving average, including an Exponential moving average (Exponential smoothing), will not be affected by the day of the month. To achieve a hard limit, it would instead take note of how many times the user was notified in the last 30 days, rather than only looking back to the first of a calendar month. It might effectively look back even further to measure the long-term average of the rate of incoming RfCs (the recursive filter does that without needing to actually look back but by simply recalculating and/or shifting an array of 30 to 90 numbers).
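The rolling 30-day hard limit described above might look something like the following sketch. The class name `RollingLimit` and its interface are hypothetical, and it assumes `try_send` is called with non-decreasing timestamps; it is an illustration of the idea rather than a proposal for the bot's actual code.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingLimit:
    """Allow at most `limit` notifications in any rolling window."""

    def __init__(self, limit, window=timedelta(days=30)):
        self.limit = limit
        self.window = window
        self.sent = deque()  # timestamps of notifications still inside the window

    def try_send(self, now):
        """Record and allow a notification at `now` if the limit permits it."""
        # Discard timestamps that have aged out of the rolling window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) >= self.limit:
            return False
        self.sent.append(now)
        return True


# A user who wants at most 2 notifications in any 30-day period.
user = RollingLimit(limit=2)
start = datetime(2020, 6, 25)
print(user.try_send(start))                       # allowed
print(user.try_send(start + timedelta(days=3)))   # allowed
print(user.try_send(start + timedelta(days=10)))  # refused: limit reached
print(user.try_send(start + timedelta(days=31)))  # allowed: first one aged out
```

Because the window moves with the current date, nothing special happens on the first or last day of a calendar month.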

The smoothing process and calculations involved prevent a lay-user from understanding the system easily.

Not true either. Cars, GPSes, health monitors, etc. routinely make calculations like Miles per hour, average speed, miles per gallon, miles before car runs out of gas, heart-rate, blood pressure, etc. using these methods. I have rarely heard customers complain that they did not understand the algorithm that gave the result. As long as it does what it is supposed to do and as they expect it to, they don't care. If it does not, then they complain, and ask that it is fixed, which is why we are here having these discussions: It is not performing the way people expect it to. I believe this can be solved without asking them questions. Mathglot and I have a good sense of what they expect and how to give them what they want, but for reasons I don't fully understand, you seem opposed. Perhaps, you don't understand how the proposed solution would work, and maybe we can work that out...

(Because respondents may not understand the most basic components/elements of Systems design and software design such as Queue (abstract data type), Queueing theory, Stack (abstract data type), LIFO, FIFO, or Cache replacement policies, I think it would be better to discuss those kinds of things with people who understand them. I wouldn't oppose inviting more people with software expertise to the discussion.)

I strongly believe that systems that are effectively "black boxes" are less likely to attract signups

I think we all agree that a "black box", especially code that is not public, is undesirable. But right now, because your code is undocumented and it seems likely that no one tried to review it--which is why no one caught the bug of cold-start--it is currently a "black box", and I don't think it is clear to anyone--including me--exactly how it works, how it chooses who to notify, because you don't have documentation for the overall program, and no real documentation for the code and its modules, except for notes to yourself.

The methods Mathglot and I are proposing can be easily documented at the customer (user) level in language ordinary humans can understand in terms of functionality -and- at a more sophisticated level for those who can read more technical specs. The code could be carefully and accurately documented to explain exactly what it is calculating and how and why those calculations will give the desired result described in the functional definition.

What may be an option to help address the actual issue you raised,...

I'm not convinced that would address the main issue. Your suggested solution treats the start of the month and end of the month differently than every other day of the month. I consider that a bug.

Also, I am increasingly convinced you are not taking into account the rate of input vs. rate of output. Without that analysis either performed before the program is run, or better on-the-fly as data comes in, I believe it's not going to work as users would expect it to, and you will continue to receive complaints until it does. The complaints may not be immediate, but they will come when editors see something weird, like the cold-start, or get clumps of notifications followed by no notifications at all. --David Tornheim (talk) 11:49, 23 June 2020 (UTC) [copy edit 20:54, 24 June 2020 (UTC)]

@David Tornheim: To briefly address the specific point about moving averages: this being a volunteer project, number of edits fluctuates with incredible frequency. It may be possible to produce an appropriate model, but it would by no means be a perfect predictor, and I'm unconvinced it would be much better than pure random chance - considering the really quite significant amount of time and effort that would need to be put in to produce such a model, I am unconvinced that that tradeoff is worth it. Either way, what you are proposing here is something different to the actual specification for what the FRS is, as I have pointed out a number of times now. I'm happy to have the conversation about whether that's a good thing or not, but I think it's worth bearing several things in mind about that.
Very few people are still talking about this, so saying "you will continue to receive complaints until it does" isn't accurate, I don't feel. That's not trying to discredit what you're saying, but it is a recognition of the fact that this is an issue that an infinitesimally small number of users are still concerned over.
The cold start problem no longer exists in the same way, because, as I mentioned many paragraphs ago, it was primarily caused by a shared random seed for the run - meaning some users got a huge amount of RfC invites, whilst others got none. That random seed is now regenerated on each of the message rounds, meaning that there would no longer be the same sort of deluge in practice, even if it is theoretically possible.
I'm sure this isn't what you mean to convey, but reading your messages, what you're saying comes across to me as being quite rude - perhaps that's because I'm a Brit; in fairness, we're not very good at direct confrontation! That notwithstanding, I take issue in particular with "Mathglot and I have a good sense of what they expect and how to give them what they want, but for reasons I don't fully understand, you seem opposed. Perhaps, you don't understand how the proposed solution would work, and maybe we can work that out...", which I read as being quite insulting.
Every question you have asked me, I have answered, with links to the relevant parts of the code, explaining even what other people's libraries used in the code do; if you feel it would be helpful, I am happy even to produce a separate high-level documentation document for the bot's code. This is, as you are now aware, all far beyond the requirements for a Wikipedia bot; I'm choosing to subject myself to a significantly higher level of scrutiny than is required. I appreciate that this is the first time you've been involved with bots on Wikipedia, but I'd encourage you to consider this sort of thing is why some bot developers choose to keep their code private. I could have not published the code, said "all fixed" at the start of this whole conversation, and from the lack of other complaints, it looks like just nothing further would have happened; I didn't do that, because I want to be open to constructive criticism, and I want to be open to improvements, but both of those things need to come from a constructive, respectful conversation.
Both of us are volunteers, giving up our free time to do this; there is no obligation placed on anyone on this project to edit, build bots, or do anything of the sort. I am, as I have said, by all means open to constructive criticism of my work, and suggestions for improvements, but I feel this is departing from "this could be an improvement" more into the realm of "this is totally unacceptable to continue running without this specific change". If that's how you feel, that's fine, but you should raise that point at WP:BOTN if you feel that way, and if there's consensus that agrees with you, the bot can be stopped until such a change is made; nobody else has said anything of the sort. Naypta ☺ | ✉ talk page | 12:21, 23 June 2020 (UTC)
@Naypta: I decided to give this discussion a cooling off period, since you seem to have been upset by it. I have not forgotten about it at all or lost interest in how this extremely important service functions. I am still interested in understanding how your code works, and it is 10x harder to understand without documentation. I believe you made an offer somewhere above to document it. I would very much appreciate that. Right now, I am interested in the code that handles the incoming data:
(1) How do you parse the FRS page? I believe some of the code may be in yapperbot-frs/tree/master/src/frslist, but it's not clear what the various subroutines do.
(2) What is the data structure that stores user data found on the Wiki FRS page? Where is it defined? Which modules have access to that data and use it? Is it global?
(3) How do you parse the page(s) relating to current or new RfCs? What page do you parse that lists the open RfCs? How do you determine which categories they belong in?
I do see that on line 30 of rfc.go you have defined a data structure regarding each RfC.
However, I'm not sure where you instantiate that structure. Is there a global array, linked list, hash, or other variable to store the data for each RfC? Or is there just one instance of it?
I'm considering creating a page here on Wikipedia, perhaps Yapperbot/Documentation and/or Yapperbot/CodeDocumentation to document what I have found. It would be a lot easier if you helped with that rather than leaving me to reverse engineer it.
I'm also curious about your plans for the survey. We (you, me, and Mathglot) obviously disagree about the purpose and goals. The survey as it is written is unable to allow users to consider and vote in favor of the functionality that I (and likely also Mathglot) believe it should have. We seem to disagree on what the functional definition actually is: You say it meets the spec, and Mathglot and I believe it does not and has undesirable defects.
I believe with some cooperation the survey could be changed so as to ask users the appropriate questions about what the functional spec is and how it should be interpreted; however, I don't want you to feel like I am insulting you by trying to cooperate with you on it, so that it addresses our concerns. If you are unwilling to accommodate my concerns in the survey, an alternative is that I could launch my own RfC about it, and use my own questions that would be binding on the functional definition of the FRS. I think it would be better to work together on this rather than have two different surveys/RfCs. --David Tornheim (talk) 08:35, 3 July 2020 (UTC)
@David Tornheim: I do not feel you are trying to insult me by cooperating with me on it; I have tried to make it as clear as I can that I absolutely welcome feedback and constructive criticism. I pulled you up on the "Perhaps, you don't understand" comment, because that comment in particular came across as rude. Let's move on, though.
The FRS page is called for a parse in main.go on line 49, which calls Populate() in the frslist package. That function then calls populateFrsList(), which fetches the wikitext of WP:FRS, and then loops over each regex match of the listParserRegex. For each match, it constructs an FRSUser object dependent on the limits that the user has set. Those FRSUsers are then stored in a map[string][]FRSUser - that is to say, a map of string keys and arrays of FRSUsers as values - which is scoped specifically to the frslist package.
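The shape of that data structure can be sketched roughly as follows. This is an illustration under assumptions, not the bot's actual code: the two regular expressions, the fields of FRSUser and the function parseFRS are hypothetical stand-ins, and the real listParserRegex may well differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// FRSUser is a hypothetical, simplified stand-in for the bot's type.
type FRSUser struct {
	Username string
	Limit    int // invitations per month; 0 means no limit was given
}

// Assumed patterns: a level-3 heading names a category, and each
// subscriber is a line such as "* {{frs user|Example|3}}".
var (
	headerRegex = regexp.MustCompile(`^===\s*(.+?)\s*===$`)
	userRegex   = regexp.MustCompile(`^\*\s*\{\{frs user\|([^|}]+)(?:\|(\d+))?\}\}`)
)

// parseFRS builds a map from category name to the users subscribed to it.
func parseFRS(wikitext string) map[string][]FRSUser {
	list := make(map[string][]FRSUser)
	category := ""
	for _, line := range strings.Split(wikitext, "\n") {
		if m := headerRegex.FindStringSubmatch(line); m != nil {
			category = m[1]
			continue
		}
		if m := userRegex.FindStringSubmatch(line); m != nil {
			limit, _ := strconv.Atoi(m[2]) // an absent limit parses as 0
			list[category] = append(list[category], FRSUser{Username: m[1], Limit: limit})
		}
	}
	return list
}

func main() {
	page := "=== History ===\n* {{frs user|Alice|3}}\n* {{frs user|Bob}}\n=== Science ===\n* {{frs user|Carol|1}}"
	for category, users := range parseFRS(page) {
		fmt.Println(category, users)
	}
}
```

Keeping the map as a package-level variable inside one package, as described above, means other packages can only reach it through that package's exported functions.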
Open RfCs are queried through the main.go call to queryCategory, which then queries for locations where the {{frs}} template is transcluded. main.go then calls to extractRfcs from the pages it finds, which regex-matches every open RfC, its categories, and its ID. For each of those RfCs, the code then checks if it is already done, and if it isn't, but it has an RfC ID set, then it runs a request for feedback.
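The final check in that flow (skip an RfC that is already done, skip one with no ID, otherwise request feedback) can be sketched as a simple filter. The RfC type, the done map and the function pendingRfCs are hypothetical names chosen for this illustration and do not come from the repository.

```go
package main

import "fmt"

// RfC is a hypothetical, trimmed-down record for one open RfC.
type RfC struct {
	ID         string
	Categories []string
}

// pendingRfCs returns the RfCs that still need invitations: those that
// have an ID set and whose ID is not yet recorded in done.
func pendingRfCs(found []RfC, done map[string]bool) []RfC {
	var pending []RfC
	for _, r := range found {
		if r.ID == "" || done[r.ID] {
			continue
		}
		pending = append(pending, r)
	}
	return pending
}

func main() {
	found := []RfC{
		{ID: "A1B2C3", Categories: []string{"hist"}},
		{ID: "", Categories: []string{"sci"}}, // no ID assigned yet
		{ID: "D4E5F6", Categories: []string{"pol", "hist"}},
	}
	done := map[string]bool{"A1B2C3": true}

	for _, r := range pendingRfCs(found, done) {
		fmt.Println("would request feedback for", r.ID, r.Categories)
	}
}
```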
With the knowledge that main() in main.go is the entry point, code navigation on GitHub should allow you to read through the entire execution of the program, though. I've composed each of those links just from stepping through GitHub's code navigation, clicking on functions and using their "Definition" and "References" tabs as appropriate.
I have written an extra question into the survey which tries to address the specific question you are bringing up; please feel free to correct it if you feel it does not. Let me know what you think. Naypta ☺ | ✉ talk page | 11:27, 3 July 2020 (UTC)
Naypta Thanks for the response. The answers will hopefully help some. Unfortunately, the survey question does not really address my concern. I'm thinking about how to rewrite. I created a documentation page here: User:Yapperbot/Documentation --David Tornheim (talk) 17:59, 3 July 2020 (UTC)

Documentation

I created a documentation page here: User:Yapperbot/Documentation --David Tornheim (talk) 18:00, 3 July 2020 (UTC)

My understanding is that the code is here: https://github.com/mashedkeyboard/yapperbot-frs

Tools are here: https://github.com/mashedkeyboard/ybtools

Above I request:

Could you please document [the code]?--at the highest level describing exactly how it functions, on each page, explaining what the module does, what each subroutine is for, any important dependencies on other modules and what uses it, each subroutine in each module[], its inputs and outputs and every significant variable, especially any global variables (not things like indexes and temp variables). I don't even know where to look for routines called from main without searching every module or guessing, which routines are doing the input or output from the program. Sorry if I sound like your professor or TA. --David Tornheim (talk) 02:35, 22 June 2020 (UTC)

Above you (Naypta) reply: "as I'm sure you've seen already, the vast majority of methods do have a documentation comment"

Actually, no I don't.

Are you telling me that main.go is documented?

The only comments other than the GNU header are:

4: // Yapperbot-FRS, the Feedback Request Service bot for Wikipedia
67: // gets a list of all active RfCs. We'll manage which ones to deal with later
80: // Set our runfile to store this now, as there's potentially going to be nothing in the queue
95: // give it at least an hour of tranquility before invites go out
96: // this is gcmend not gcmstart as it's going down from the most recent
105: // There seems to be no guarantee that the value of pages will be ordered, in any way.
108: // on the first item of the entire set, and the first item ONLY, save the timestamp and the page id into a var to write to runfile later
109: // RfCs don't use this as they're given IDs and don't need it
111: // because pages is unordered, we have to make a separate, limited request for this
112: // this gets us the first (latest) item
133: // Remember to do this! Golang by default turns integers just into the
134: // corresponding unicode sequence with string(n) - e.g. string(5)
135: // returns "\x05"
147: // format it into a string integer
162: // (content, title, excludeDone)
186: // Because each article can only have one GA nomination at a time, it's not necessary to do the full gamut of RfC checks here
187: // we can instead just pass it on to requestFeedbackFor after checking that it's not the same page we did first last time
188: // to do that check, we check whether the page ID and timestamp are the same (both stored in the runfile) - if they are, it's the same page
190: // it's the first page from last time, we're probably at the end - skip over it
198: // if it's a new file and no pages are picked up, just create the runfile so future runs will know where to start from
209: // If it uses a runfile, and there actually is something to write
211: // Store the done timestamp and page id into the runfile for next use

None of this really explains what the program does, what the input and the output is, the main variables, or how it works.

With just these comments, I have countless questions: (1) What do all these modules do?

"yapperbot-frs/src/frslist"
"yapperbot-frs/src/ga"
"yapperbot-frs/src/rfc"
"yapperbot-frs/src/yapperconfig"
"cgt.name/pkg/go-mwclient"
"cgt.name/pkg/go-mwclient/params"
"github.com/mashedkeyboard/ybtools/v2"
"github.com/metal3d/go-slugify"

(2) What is being initialized here?

func init() {
    ybtools.SetupBot(ybtools.BotSettings{TaskName: "FRS", BotUser: "Yapperbot"})
    ybtools.ParseTaskConfig(&yapperconfig.Config)
}

(3) What is stored in the variable "w"? When do you plan to use it? What will the client do?

w := ybtools.CreateAndAuthenticateClient(ybtools.DefaultMaxlag)

(4) What does this routine do? What is the "frslist"? Where is it stored? Is it a global variable? What kind of data-structure is it?

frslist.Populate()

I am not sure how I am supposed to answer such basic questions based on the comments above.

Mathglot Am I missing something here? --David Tornheim (talk) 12:33, 22 June 2020 (UTC)

@David Tornheim: I fear that the problem here is a mix of lack of familiarity with Go, and perhaps a misunderstanding on my part as to what it was which you were trying to do; I had not realised you were trying to read the code line-by-line, rather than just getting a feel for what it does. Nevertheless, I'm more than happy to take you through each of these points!
With regard to the modules:
  • frslist, ga and rfc are just separate processing modules of the Yapperbot-FRS code. They're parts I've split off for ease of maintenance and clarity of usage; there's nothing hugely special about them. yapperconfig is similar; it is primarily there to hold a configuration object which is shared between the main package and each of the submodules mentioned above. As I noted above, as each of these modules is included in this repository, you can navigate through all of their usages by simply clicking on the function name, whereupon GitHub will show you the definition and other usages.
  • go-mwclient and go-mwclient/params make up the library I have chosen to use for interacting with the MediaWiki API. All external libraries in Go are imported by URL, so if you want more information on any of them, you can just go to the URL and see the code.
  • ybtools is my shared library of code across Yapperbot tasks; the FRS is not the only thing Yapperbot does, so to avoid writing the same code again and again, I have a repository of shared code between the tasks.
  • go-slugify is a very simple library for turning "Humanized strings of all sorts!" into "slug-strings-that-can-be-filenames".
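To make the slug idea concrete, here is a hand-rolled sketch of the transformation. It deliberately does not call go-slugify, whose exact API and output rules are not shown in this discussion; the function slugify below is a hypothetical approximation that handles ASCII only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonAlnum matches any run of characters that are not lowercase ASCII
// letters or digits.
var nonAlnum = regexp.MustCompile(`[^a-z0-9]+`)

// slugify lowercases s, replaces every run of other characters with a
// single hyphen, and trims hyphens from both ends.
func slugify(s string) string {
	lowered := strings.ToLower(s)
	return strings.Trim(nonAlnum.ReplaceAllString(lowered, "-"), "-")
}

func main() {
	fmt.Println(slugify("Humanized strings of all sorts!"))
	// prints "humanized-strings-of-all-sorts"
}
```

A string in this form is safe to use as part of a filename, which is presumably why the bot needs it.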
In the initialisation function, my library, ybtools, is having its components configured correctly, so that it can be easily used without repeating the bot name and username. This is used for things like configuration file naming, determining whether or not a bot is allowed to edit a page (see {{nobots}} and {{bots}}), and for sending me appropriate errors if something goes wrong. It's also used to initialise the authentication process for CreateAndAuthenticateClient.
CreateAndAuthenticateClient, along with other ybtools methods, has documentation automatically generated from its comments on pkg.go.dev. w is a pointer to a mwclient.Client, which handles interactions with the MediaWiki API. Any calls to the wiki itself that aren't handled automatically by ybtools would take place through methods on the Client. Once again, as with all golang libraries, the package has documentation on pkg.go.dev.
frslist is an imported module, as you highlighted in the module list you posted above. Its role is handling the WP:FRS page itself - parsing it, storing its data, and updating it as necessary. Populate(), as described here, "sets up the FRSList list as appropriate for the start of the program". This means loading in the WP:FRS data, performed here, and storing that in an array of FRSUsers, as well as populating the counts of messages already sent this month from User:Yapperbot/FRS/SentCount.json.
I hope that's helpful! Naypta ☺ | ✉ talk page | 12:53, 22 June 2020 (UTC)
@Naypta: Yes. That will get me started. Thanks. I need to spend a little more time figuring out features unique to Go, jump around the code to get a sense of where everything important is, and probably ask some more questions. I might create a separate page that documents the code as I understand it, and you can correct me. --David Tornheim (talk) 14:13, 22 June 2020 (UTC)