Talk:Backup/Archive 1

From Wikipedia, the free encyclopedia

Old comments

There is a new article at Data backup, which may need to be merged into this one. I left a note on the author's talk page, and recommended either merging it here or changing its focus enough to make a standalone article. 68.81.231.127 09:42, 18 Dec 2004 (UTC)

"A backup should not use compression. Compression reduces data redundancy. Redundancy might be useful when restoring data from damaged media." Does this apply to CD-R–based backups? —Wins oddf

I've just rephrased that to be a bit more straightforward, but yes, regardless of media, it still applies. If you scratch the CD and the data on it is uncompressed, you'll lose part of a file or so. If you scratch it and the data on it is compressed you may lose the entire compressed archive. — mendel 20:43, 14 September 2005 (UTC)
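A minimal Python sketch of the point above, assuming zlib stands in for whatever compressor a backup tool might use; the record layout and the flip_byte helper are made up for illustration. One damaged byte costs the plain copy a single record, while a naive decompressor rejects the whole compressed stream (dedicated recovery tools may salvage more than this suggests).

```python
import zlib


def flip_byte(blob: bytes, index: int) -> bytes:
    """Return a copy of blob with one byte inverted, simulating a scratch."""
    damaged = bytearray(blob)
    damaged[index] ^= 0xFF
    return bytes(damaged)


# Hypothetical backup payload: 2000 fixed-width text records, one per line.
records = [b"record %06d" % i for i in range(2000)]
plain = b"\n".join(records) + b"\n"
packed = zlib.compress(plain)

# Damage the uncompressed copy and count how many records changed.
damaged_plain = flip_byte(plain, len(plain) // 2)
lost_records = sum(
    a != b for a, b in zip(plain.split(b"\n"), damaged_plain.split(b"\n"))
)

# Damage the compressed copy the same way and try to read it back.
try:
    zlib.decompress(flip_byte(packed, len(packed) // 2))
    compressed_readable = True
except zlib.error:
    compressed_readable = False

print("records lost in the plain copy:", lost_records)
print("compressed copy still readable:", compressed_readable)
```

The trade-off is space: the compressed copy is far smaller, so whether the extra redundancy is worth it depends on the media and on how many copies are kept.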

Paris bank fire

Talk copied from wikipedia:Reference desk/Miscellaneous Jay 08:00, 7 October 2005 (UTC)

Which is the bank mentioned in the Backup article : "A few years earlier (to 2001), during a fire at the headquarters of a major bank in Paris, system administrators ran into the burning building to rescue backup tapes because they didn't have offsite copies." ? Jay 14:01, 22 September 2005 (UTC)

In the last decade there have been two Paris bank fires. The Credit Lyonnais headquarters in 1996. And a fire at Banque de France in 1999, which I don't think anyone cared about. I can't verify the above anecdote. lots of issues | leave me a message 05:05, 23 September 2005 (UTC)
Confirmed that the headquarters of the Lyonnais burnt on May 5, 1996, apparently arson. It is rumoured [1] that the disappearance of archives was intentional (the Lyonnais was, at the time, caught up in major scandals). The 1996 fire seems to be cited as an example of what should not be done on sites promoting data backups. David.Monniaux 16:18, 26 September 2005 (UTC)
Yes, Credit Lyonnais seems to be the one since it gave many Google hits on "fire" and "Credit Lyonnais". Got a case study for data backup using this fire incident as an example. Jay 07:55, 23 September 2005 (UTC)

The source that is referenced to support the claim about IT admins running into the burning building to rescue the tapes is unreliable. It looks like something copied from some kind of usenet group, and the author of that source even admits that everything in his post is unconfirmed. The citation I'm referring to is number 16 in the endnotes. Schmungles (talk) 01:18, 9 September 2009 (UTC)

Verb versus Noun

I have often seen the verb form written as two words "to back up the system", whereas the noun is always one. Not sure if this is worth mentioning in the article.

212.32.67.111 16:10, 25 December 2005 (UTC)

Differential backup and Types of backup changes

A few changes I just made to the article: first, the references to the "archive bit" in the "Types of backup" section have been removed - the archive bit is a system-specific feature of WinNT systems and is not relevant to backups in general (besides, it tends not to be used by most modern backup software, which keeps its own record of what has been backed up). Secondly, I have limited the description of restoring using differential backups to the case (Full backup + differential backup) - there had been mentions of incrementals as well, which for the purposes of exposition had to be dropped - someone not familiar with the terms will be confused as to whether incrementals taken before the differential are required, whereas the author (I assume) meant any incrementals newer than the differential. For clarity, keep it simple; then the reader can grasp the concepts and the more complex cases will suggest themselves without clouding the descriptions. -- unsigned comment made by Special:Contributions/88.107.68.168 at 23:19, 26 August 2006

Formats

"A backup should rely on standard, well-established formats."

Such as?... — Omegatron 00:50, 29 January 2006 (UTC)

On Unix and most NAS: tar, gtar, cpio, dump. On NetWare: SIDF. On Windows: MTF (Microsoft Tape Format). Vendor-specific: OTF ("Open" Tape Format; depends on your definition of open, cf. EMC Networker man pages), NetBackup multiplexed gtar, various vendor-specific implementations of MTF, and whatever TSM uses.

That's the great thing about standards: there are so many to choose from.

--Sharkspear 00:37, 4 January 2007 (UTC)

Incrementals

This article is wrong. The discussion of differential versus incremental needs to be corrected. It is fuzzy and misleading as it is now. I suspect the author doesn't really understand the concepts properly. Some examples of backup/restore strategies using "1) full backup + Incremental" and "2) Full backup + differential" bringing out the Pros and Cons of each is also necessary. 82.46.191.5 09:47, 25 December 2006 (UTC)

Which part of the article is wrong? There is no explicit discussion of differential versus incremental. The only mention of differential backups is in the Glossary section and each of those entries looks good to me. I suggest putting the details of incremental vs. differential in the Incremental backup article. My opinion is that differentials are rarely used and that writing a lot about them in the general backup article would only serve to obfuscate the larger issues. -- Austin Murphy 15:46, 27 December 2006 (UTC)
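The two strategies asked about in this thread can be sketched in Python. This assumes the usual definitions, which the thread itself does not spell out: an incremental holds changes since the most recent backup of any type, and a differential holds changes since the most recent full. The restore_chain function and the "full"/"diff"/"incr" labels are hypothetical names for illustration.

```python
def restore_chain(history, target):
    """Return the indexes of the backups needed to restore to history[target].

    history is a list of "full", "diff" or "incr" labels in the order the
    backups were taken; history[0] is assumed to be a full backup.
    """
    if not history or history[0] != "full":
        raise ValueError("the first backup must be a full backup")
    chain = []
    i = target
    while True:
        chain.append(i)
        kind = history[i]
        if kind == "full":
            break
        if kind == "diff":
            # A differential only depends on the most recent full backup.
            i -= 1
            while history[i] != "full":
                i -= 1
        else:
            # An incremental depends on whatever backup came just before it.
            i -= 1
    return chain[::-1]


print(restore_chain(["full", "incr", "incr", "incr"], 3))  # full + every incremental
print(restore_chain(["full", "diff", "diff", "diff"], 3))  # full + latest differential
print(restore_chain(["full", "incr", "diff", "incr"], 3))  # mixed schedule
```

Under these assumptions the trade-off is visible in the chain length: incrementals are small to take but every one since the full is needed at restore time, while differentials grow over the week but only the latest one is needed.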

Storage Media

With regard to the Optical Media section, is "(This is equivalent to 12,000 images or 200,000 pages of text.)" really true? It depends greatly on formats, quality, compression etc.

Also I was considering changing the format of this section to an "Advantages/Disadvantages" format. Ozstrike 01:34, 9 January 2007 (UTC)

I agree that the 12,000/200,000 comment is basically baloney and suggest removing it. As to changing the whole section, I would mostly be interested in including the characteristic features of each medium rather than turning it into a face-off of sorts. Feel free to contribute! -- Austin Murphy 01:57, 11 January 2007 (UTC)

this is a back up —Preceding unsigned comment added by 122.161.83.133 (talk) 07:53, 31 March 2008 (UTC)

Link spam for Veritas NetBackup

I see a lot of proprietary language on this article based on Veritas NetBackup. Is this a really, really hot product or does this reek as much as I think? Marc W. Abel 16:44, 19 April 2007 (UTC)

Hi Marc, I just delinked NetBackup from two of the glossary entries. The point in mentioning NetBackup by name 5 times in the Glossary is to make sense of some of the unique terms used for different functions. NetBackup is one of the Big Three commercial unix backup packages. The other two are Tivoli and Legato. BackupExec is pretty big on Windows. I don't know the terminology for them so I didn't include it. If you have other terms that could be added, please do. -- Austin Murphy 18:33, 19 April 2007 (UTC)

Grammar

The grammar in the 'backup' article appears to be erroneous in a few places. Take for example the opening line: "backup refers to the copying of data so that these additional copies may be restored after a data loss event." I have issues, particularly with the text "restored after a data loss event". From what I understand, "data loss" is the result of failure (software/hardware, electrical infrastructure anomaly, fire, etc), while the "event" refers to the instance in which the said data loss occurred. I suggest rephrasing the text like so: "backup refers to the copying of data onto supplementary media and facilitate data restoration in the event of a failure leading to data loss." (66.41.51.46 01:37, 12 May 2007 (UTC))

I agree the language was a bit clumsy, so I've updated it. However, I don't think "supplementary media" really makes any more sense than the previous wording. Check out the data loss page. There is more to data loss than failures. Time plays a crucial role in how data is handled. I think the phrase "data loss event" accurately conveys this. -- Austin Murphy 15:21, 14 May 2007 (UTC)

Failed GA

This article has failed the GA noms due to being written like a list, as well as the jargon in various places. If you disagree with this decision feel free to take it to WP:GA/R. Tarrettalk 20:53, 10 September 2007 (UTC)

Too much like a list?

The GA criticism seems to indicate that a way to improve the article would be to make it less like a list. Wikipedia:Embedded list and Wikipedia:Lists have a bit of official info on the subject. I think they generally support the way the article is laid out. Still, it may work better if there was more prose. Comments? -- Austin Murphy 15:02, 22 October 2007 (UTC)

A few points based on experience..

I often ask myself why many (if not most) users fail to maintain any sort of backups, and the main reason is most likely the massive level of complexity of the commercially-available backup systems. Complexity which for the most part is totally unneeded. A key example is the 'media pooling' regime of Windows servers, which through its complexity and troublesomeness is a very frequent cause of backup failure.

It would be beneficial to explain in straightforward terms WHY rotational backups are needed; most users do not grasp the fact that repeatedly using the same media only protects against losses which are immediately noticed. A few pictorial examples of rotation might be helpful in explaining the principle.

Another point worth touching on is that many proffered backup 'solutions' are OK for backing-up documents, but woefully inadequate when it comes to a hard-disk failure, in that they are incapable of fully restoring the OS or system-partition from a backup.

The point about standard formats is a good one, and to this I would add that a backup is of little use for disaster-recovery unless the format in which it was made, and the disk-partitions it represents, are documented. As is a copy of the backup software itself, especially if this is proprietary.

Perhaps the point about verification could be made more strongly, in that many backup systems are notorious for failing to notify the operator that they have started to 'write blanks' and will continue to do so indefinitely unless a periodic manual check is made that the backup actually contains data.

Finally, it might be relevant to mention that since backup processes typically run under a specific user account (e.g. root or Administrator) a frequent pitfall is that of forcing a change of this account's password as part of a security policy, and thereby knocking out the backup. Since this typically also knocks out any error-notification process, the fact may go unnoticed until a data loss occurs. --Anteaus (talk) 11:00, 6 December 2007 (UTC)

Hi Anteaus, I'm not exactly sure what you're getting at here, but you are welcome to make the edits yourself. Wikipedia's guideline on this is called "Be BOLD!" If you would like to "test-drive" some edits, you can leave them here for some feedback. Also, consider that this is an encyclopedia, not a how-to manual, and it is directed toward a general audience. Deep levels of detail are welcome, but they must fit into the context. -- Austin Murphy (talk) 16:44, 6 December 2007 (UTC)
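The rotation point raised in this thread can be illustrated with a small Python sketch. It assumes one backup per day, taken at the end of the day onto the least recently used medium, and a loss that is noticed before that day's backup runs; the function names are hypothetical.

```python
def surviving_backups(media_count, noticed_on):
    """Days whose backups still exist when the loss is noticed.

    With media_count media reused in strict rotation, only the most
    recent media_count daily backups have not been overwritten.
    """
    return list(range(max(0, noticed_on - media_count), noticed_on))


def recoverable(media_count, corrupted_on, noticed_on):
    """True if some surviving backup predates the corruption."""
    return any(
        day < corrupted_on
        for day in surviving_backups(media_count, noticed_on)
    )


# A single reused medium only helps if the loss is noticed the same day.
print(recoverable(media_count=1, corrupted_on=10, noticed_on=10))
print(recoverable(media_count=1, corrupted_on=10, noticed_on=11))
# Seven rotated media leave roughly a week to notice the problem.
print(recoverable(media_count=7, corrupted_on=10, noticed_on=15))
print(recoverable(media_count=7, corrupted_on=10, noticed_on=17))
```

In this model a loss is recoverable only if it is noticed within media_count days, which is why a single overwritten medium protects only against problems spotted at once.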

backup window

The backup window is not necessarily the same as doing a cold backup of a database or application. Fuzzy backups are a risk when doing hot backups or open file backups improperly. Cold backups require a strict backup window, but the term backup window is more broad than just that. -- Austin Murphy (talk) 19:08, 12 April 2008 (UTC)

Well, I have different understanding of the term, but I won't argue. But could you at least mention fuzzy backup somewhere in the article? Do you feel this is non-issue? --Kubanczyk (talk) 07:43, 14 April 2008 (UTC)
Sorry, I got distracted after that edit and forgot to move fuzzy backup to where I thought it fit. Open file backup is an important topic. I'm thinking of starting a new page to better describe the process for different types of data and the problem of getting fuzzy backups. -- Austin Murphy (talk) 18:14, 15 April 2008 (UTC)
OK, great work. I also feel that open file backup is an often-overlooked major practical issue. Maybe it would be better to generalize the Fuzzy backup article, as opposed to creating a new one? I think it's a good starting point. --Kubanczyk (talk) 20:09, 15 April 2008 (UTC)

truncating the introduction as suggested

--ruth_williams_aug20something--08--

In information technology, backups are typically to avoid loss by creating copies that can be used to restore original data after disasters (called disaster recovery), accidents or corrupted disc. Backup storage devices have evolved to concentrate on geographic redundancy, data security, and portability. Techniques have developed to allow optimal techniques regarding for example ... open files; live data sources; compression, encryption and de-duplication. Procedures are still evolving. Backups and backup systems differ from archives and archival systems in the sense that archives are the primary copy of data, typically kept as a historical reference and for future use, and backups are a secondary copy to guarantee replacement in case of loss. Backup systems differ from fault-tolerant systems in the sense that backup systems assume that a fault will cause a data loss event and fault-tolerant systems assume a fault will not.

--shorter-and-more-like-an-introduction---- —Preceding unsigned comment added by 125.255.10.253 (talk) 12:48, 25 August 2008 (UTC)

Hi 125.255.10.253, WP:LEAD suggests that a long article like this one should have 3 or 4 paragraphs in the lead section. It also suggests that "The lead should be able to stand alone as a concise overview of the article." I think that the existing lead section meets these goals. I'm not sure that your suggested text does that. -- Austin Murphy (talk) 14:30, 26 August 2008 (UTC)

Binary differential method

--narax 26/08/2008----

In "Selection and extraction of file data" add "Byte level Incremental or Differential" —Preceding unsigned comment added by Narax (talkcontribs) 12:16, 26 August 2008 (UTC)

In "Selection and extraction of file data" add "Byte level Incremental or Differential" -- Nara Moreno (talk) 16:30 26 August 2008

manipulation

Hi Mike A Quinn, Here's why I reverted some of your recent changes.

De-dupe is performed long after the backup software decides what data should be backed up. Like compression and encryption, it is just an alternate way to represent the same data on the storage media. I think that fits better into the manipulation section.

Staging is a little more complicated because in one sense, it is a temporary storage spot for data and in another sense it is the combination of on-line and near-line media management methods. I've added some text to this effect to the managing the data repository section. I think holding a temporary copy of data qualifies as manipulation. The individual backup datasets don't get modified in staging, but the way they are clustered on the final destination media can be changed dramatically.

-- Austin Murphy (talk) 21:52, 2 December 2008 (UTC)

Hi Austin,

Thanks for opening the talk topic, I agree with your comments to a certain extent, however the wiki article is entitled 'backup' and not 'backup software'. A general concern I have with the backup article is the manipulation section. During backup ALL data is manipulated to a certain extent as it flows from one medium to another however I think it is acceptable to keep encryption, compression, duplication and refactoring as these tasks actually transform (manipulate) the data being backed up into a different or separate form.

Deduplication

Does not rely on 'backup software' to perform a backup although the process of deduplicating backup data can be performed in association with a backup software. I do agree that Deduplication manipulates data to a certain extent as it leaves a pointer within the data set that points to the location of the unique file. However I would suggest that leaving a pointer manipulates data to about the same extent as the OS manipulates data when it changes the file status after a backup has been performed. Although deduplication is an alternate way to represent the data it selects data to store and that selected data-although most deduplication technologies are proprietary- is still in essence the same data as resides on the primary storage device and could in theory be directly accessible by the OS. To this extent backup software manipulates data to a far greater extent -as generally the data is not directly accessible by the OS once backed up- and so backup software might need to reside under the manipulation heading too.

Staging

Data is staged almost constantly throughout the backup process as it flows through one component to another and so therefore all backup data could be observed as being manipulated. There is however a section in this article that describes data repositories and when data is staged it is staged in a data repository awaiting its final destination, just like water is staged in a reservoir. You are correct that data is not manipulated on the staged media but it is arranged and clustered more efficiently on the storage media. Arranging and clustering data more efficiently on a backup media is what backup does; all selection options of a backup assist in the clustering and arranging of data to store it more efficiently. If staging is a manipulation of data then so should differential and incremental backup selections be considered as manipulation.

I do not intend to revert the article page again at this point in time but would value further input from yourself or any other interested party and maybe we could come up with sections in the article that include and cover our POVs.

sorry I forgot to sign above Mike (talk) 10:47, 3 December 2008 (UTC)
Hi Mike, I appreciate your comments and interest in the subject. Let me try to explain what I was thinking when I organized this article. In section 1 I tried to have a high-level view of all the different kinds of backup architectures. In section 2 I tried to describe all the different techniques that are necessary or useful for backups. Section 3 was supposed to cover all the planning and policy issues and section 4 is stuff that seems related but I can't really fit in somewhere else. I had a hard time thinking of proper names for the sections and subsections and I'm pretty sure they can be improved. I'm sure there are many other ways to improve the article too and I welcome your input in such.
I don't follow your argument about dedupe. Dedupe can dramatically change the storage requirements for backups. In this context, it is just a fancy way to compress data that is especially useful for backups. I think replacing a large dataset with a bunch of pointers is a pretty significant type of manipulation. Most commercial dedupe products are either virtual tape style or a NAS box of some type. These kinds of products are generally implemented as black boxes and have absolutely no say over what data gets copied from the computers getting backed up. They just accept the data that is sent to them and replace most of it with pointers. Some software like Rsync or BackupPC can do tricks with Hard Links that might cross over into the selection field, but I think that it still makes sense to separate the two concepts of selection and manipulation.
Staging is not quite as clear to me. D2D2T is/was a big buzzword and my thinking was that it ought to be covered somewhere. As you point out though, staging is not strictly a data transformation technique. Neither is it a unique architecture. It is more of an optimization technique that is employed to make the overall scheme more effective. The same holds true for duplication, refactoring, and multiplexing. I've updated the section name to be more general, but I'm open to other ideas.
--Austin Murphy (talk) 16:07, 3 December 2008 (UTC)

Hi Austin, Hey I just skimmed through the history of the page and noticed all your input to this article. Overall I think it is a good article. I have just added one word to the section name as we both seem to agree on what happens to the data. Mike (talk) 17:04, 3 December 2008 (UTC)
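The "replace most of it with pointers" idea discussed in this thread can be sketched in Python. This is only an illustration under stated assumptions: fixed-size chunks and SHA-256 digests as the pointers are choices made here, not a description of how any commercial dedupe product works, and the dedupe and rehydrate names are hypothetical.

```python
import hashlib


def dedupe(stream: bytes, chunk_size: int = 4):
    """Split stream into fixed-size chunks and store each unique chunk once.

    Returns (store, pointers): store maps a SHA-256 digest to the chunk
    bytes, and pointers lists the digests in original order.
    """
    store = {}
    pointers = []
    for offset in range(0, len(stream), chunk_size):
        chunk = stream[offset:offset + chunk_size]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)  # keep only the first copy of each chunk
        pointers.append(key)
    return store, pointers


def rehydrate(store, pointers) -> bytes:
    """Rebuild the original stream by following the pointers."""
    return b"".join(store[key] for key in pointers)


data = b"ABCD" * 100 + b"WXYZ"
store, pointers = dedupe(data)
print("chunks received:", len(pointers))
print("unique chunks stored:", len(store))
print("restores exactly:", rehydrate(store, pointers) == data)
```

Note that the sketch never decides which data to back up; it only changes how the received data is represented, which is the distinction between selection and manipulation drawn above.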

References and citations

References (11&12) from the two topics - Cold & Hot Data Backup are leading nowhere. Please suggest an alternative. Lakshmi VB Narsimhan 08:50, 19 August 2009 (UTC) —Preceding unsigned comment added by Lakshmin (talkcontribs)

The sentence "66% of internet users have suffered from serious data loss." should be removed or needs a good citation. The current citation (http://www.kabooza.com/globalsurvey.html) is based on "4257 respondents from 129 countries". This is by no means representative for the hundreds of millions of Internet users. Not to mention that the survey is presented by a company that needs to sell its products and might easily have faked the numbers. Martin Zuther (talk) 17:30, 21 February 2010 (UTC)

The Monitored Backup item is poorly written and needs cleanup or removal. --AlastairIrvine (talk) 10:48, 10 May 2010 (UTC)

where is discussion of archival issues like read-only retention time?

I was trying to figure out how to archive (copy once and hide somewhere) some data I want to keep forever but rarely use. I wanted the safest media possible but couldn't find a discussion right away. Now obviously these things change with technology, but an article on archival media with refs to current and historical datasheets and real-world tests would be quite helpful, and it would describe a notable feature of this topic. Finally, under the flash discussion I found some discussion of thermal decay and energy barriers but no tables comparing claimed or measured figures for current or past flash devices with those for hard disks or optical media. Thanks. Nerdseeksblonde (talk) 17:56, 21 July 2010 (UTC)

HDD stability unknown ?

"The main disadvantages of hard disk backups are (...) that their stability over periods of years is a relative unknown."

This is blatantly false. HDDs can last for very long times, if treated right - I have several working HDDs which are 10 - 16 years old, respectively. On the other hand I do also have newer models, between 2 and 8 years old, which crashed, mostly without or with just very little prior warning. So the most important thing to keep in mind when working with HDDs for back-ups is that you never know if or when an HDD is going to die. It will happen all of a sudden, and you can't do anything about it. So always keep at least two backups on identical HDDs. -- Alexey Topol (talk) 01:03, 17 November 2010 (UTC)

Backup or Archive or both!

I believe the wording you have for your 'backup' article is misleading! The term 'backup' is a generic term and you've gone into specifics of certain types of backups. If I have a backup running each day which overwrites the previous day's backup, and a corrupt file is then introduced into this backup, I have no way of recovering the original file (or a good copy)! However, I still have what is known as a 'backup' (even if it is corrupt). The description you give for a backup would lead me to believe I could still recover my original 'good' copy of this. For this to happen, I would need a backup and archiving solution. Would you agree? If my backup solution were to somehow create a 'new' copy of the data (in whatever form) and retain the original document and/or any changes, then surely this is an 'archive'! Depending on whatever solution one chooses to use for their so-called 'backup', they might not have the full capabilities you suggest they have. If you agree with this, then would you kindly amend your description please? DiveO2 (talk) 11:05, 6 February 2012 (UTC)DiveO2 - 6th Feb 2012.

What's with the "Law" section?

The so-called "Law" section in this article seems to me like it has nothing to do with law at all. It talks about confusion of terminology, then "Advice" for backing stuff up (which actually seems way out of place even if it wasn't under a "Law" section), then events related to backups. 75.45.185.124 (talk) 14:32, 23 March 2012 (UTC)

A different formatting for titles, to ease reading on Windows Phone or other smartphones?

I found that some titles get mixed up with the text when reading on Windows Phone or a smartphone, e.g. 1.3 -> Backup site or disaster recovery center (DR center); 2.3 Cold database backup; Hot database backup; etc. — Preceding unsigned comment added by Kmchanw (talkcontribs) 22:22, 3 November 2012 (UTC)

Requested move 4 August 2016

The following is a closed discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. Editors desiring to contest the closing decision should consider a move review. No further edits should be made to this section.

The result of the move request was: Consensus seems to be Not moved, opposes not countered. (non-admin closure) — Andy W. (talk ·ctb) 02:01, 12 August 2016 (UTC)


– Clearly, data backup is an important meaning of "backup", but even within computing it isn't the only meaning. There are backup batteries and other backup components ensuring fault tolerance. Besides computing, almost every critical device in engineering may have backup devices and/or procedures, be it in electric power generation, transmission, and sometimes distribution, in aerospace, rail transportation, or nuclear technology. Companies or state agencies have backup personnel available for certain types of events, the military has backup facilities, and large companies operate backup sites as part of their disaster response plans. Even having omitted quite a number of important applications, it seems obvious that data backup isn't a clear-cut WP:PRIMARYTOPIC.
But even beyond a mere WP:NATURALDISambiguation case, the current name "backup" doesn't even seem preferable to "data backup", the latter being a widely used, clear and concise alternative. It also helps bringing the associated category, currently named Category:Computer backup, in line with its main article. Renaming the category to Category:Backup would be impossible, as per common practice, categories are expected to be slightly more unmistakable than articles may be. It may however follow a rename of its main article to "Data backup". -- PanchoS (talk) 09:45, 4 August 2016 (UTC)

  • Oppose both. I have written "backup" in Wikipedia a thousand times, never even once "data backup". And in all those instances, I meant "the process of backing up, which refers to the copying and archiving of computer data so it may be used to restore the original after a data loss event". So, yes, backup is perfectly WP:PRIMARYTOPIC. The second proposal is outright preposterous. FleetCommand (Speak your mind!) 10:53, 5 August 2016 (UTC)
  • Oppose per WP:PRIMARYTOPIC. As a computer professional who administers multiple kinds of backup systems, I deal with this stuff daily, and any time someone just says "backup" they mean a data backup. If they ever mean something else, it is clarified in-context, almost always by using "backup" as a modifier of something that makes the fact that this is not a data backup crystal clear (e.g. "we need a backup router", "how often is the database synched from the live mssql1 host to the mssql2 backup host?", "this rollout has a solid backup plan", etc.). All the other things to which "backup" can be loosely applied have other terms, such as UPS, failover, hot-swappable drive, redundant system, etc., etc., and people use these terms for their clarity. There are other, non-computing backups, yes, but the entities who deal with them also use backup as an adjective, and also deal with data backups and call them backups (noun).  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  07:30, 7 August 2016 (UTC)

The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page or in a move review. No further edits should be made to this section.

Enterprise client-server backup

I owe everyone an explanation for the first substantial addition to this article in at least 5 years.

I very substantially expanded the Retrospect (software) article starting in October 2016, taking it from six short paragraphs to 10 overly-detailed screen pages. In September 2017 I was taken to task by scope_creep, after which JohnInDC made massive cuts and forced me to cut that article down to less than 2 screen pages. It ended up with 13 screen lines of built-in features, plus another 5 screen lines of extra-cost features. At the same time I proposed in the second and third substantial paragraphs here that the 15 features I had to leave out of that article, all of which are new ones added to Retrospect since 2005, belonged in a separate article named something like "Enterprise client-server backup"—without a mention of them as Retrospect features because they really are now common to all enterprise client-server backup applications. Nick Moyes suggested early in November in the Teahouse that I instead turn the new article into a new section in this article. Here it is.

Although I have taken the brief description of these features from the old version of the "Retrospect (software)" article, I have totally omitted any use of the name "Retrospect" because scope_creep considers that to be advertising. I have also in all but a couple cases, obeying JohnInDC's warning here, referenced one or another of 10 "articles that say, 'these are the important features of Enterprise Backup Software' and summarize[d] the lists they provide."

I will be entirely happy if other editors add client-server backup features that other applications have but Retrospect doesn't. I even have one NetBackup feature in mind, but I left it out so as not to have this initial version of the new section go over two screen pages. It will be up to those other editors, however, to find references in the documentation for those applications; Retrospect makes it easy because it still follows the archaic practice of having comprehensive User's Guides instead of only a bunch of Web pages. DovidBenAvraham (talk) 04:01, 15 November 2017 (UTC)

Despite scope_creep's aversion to it, I have put some terms in double-quotes. However, I have only done this for terms that are not standard industry terminology. For instance, "administrator console" is a term used by at least one other client-server backup application; Retrospect Mac just calls it the Console. DovidBenAvraham (talk) 05:16, 15 November 2017 (UTC)

I'll take a look at this and offer edits and comments. First off however, I think the section entitled "Enterprise client-server backup" should begin by simply stating what the term means, with an appropriate reference or two, rather than beginning with a lengthy quote that kind of circumnavigates the concept and has to be read through more than once to understand what the point of it is exactly. The section should begin, "Enterprise client-server backup is..." and then say what it is. JohnInDC (talk) 01:46, 16 November 2017 (UTC)
I've now done this in the section lead in terms of three challenges enterprise data backup must meet, ref'd by the Rassokhin? article (I made an educated guess on the name of the author, although Novosoft LLC is a Russian company that sells several kinds of backup services), followed by the statement—also ref'd by the same article—that client-server backup along with appointing a backup administrator is a way of meeting them. DovidBenAvraham (talk) 16:27, 16 November 2017 (UTC)
The section also needs to be written using generic language describing the various concepts, processes and strategies that are unique to enterprise backup technology, rather than listing a series of proprietary, named features that are (or are not?) unique to one particular brand of backup software on the market. JohnInDC (talk) 02:00, 16 November 2017 (UTC)
Several commonplace words like "copy" or "restore" were written with initial caps. These are not proper nouns or proprietary terms and I've changed those to lower case. I also removed proprietary terms wherever a generic term would serve as well. In several places, simple terms such as "backup server" were written within quotes, as if the terms did not mean what they said they meant; I've removed those quotes. I tried in a few instances to replace what was a pretty detailed description of Retrospect's approach to an issue with more generic, higher-level language. There are still tweaks to be made but the article now does a better job of describing the general subject area. JohnInDC (talk) 02:23, 16 November 2017 (UTC)
As I've said before in the Talk page for the "Retrospect (software)" article, my big problem is that there are no agreed-upon terms for most of these features. I left in the initial caps for terms that were in the old version of that article—not because I wanted to use proprietary terms—but because I need to define terms for purposes of this article and I thought that using initial caps was the briefest way of doing the definitions. DovidBenAvraham (talk) 16:35, 16 November 2017 (UTC)
In particular I now propose to simply put a one-sentence paragraph in the section lead defining the meaning of "media set"—without initial caps or quotes—for this section, and then put the phrase back in wherever I originally used it. If I haven't already said it, Dantz originally used the term "backup set" many years ago (and still uses it in Retrospect Windows)—but switched to "media set" in early 2009 because other backup programs had started using "backup set" for things that aren't compatible with Retrospect's definition. DovidBenAvraham (talk) 21:54, 16 November 2017 (UTC)
The fact that this is a Retrospect-specific term is a reason not to reintroduce it into the article. "Backup" makes plain English sense. "Media set" is one company's term for what in a general article should be a general term. Please don't add it back. JohnInDC (talk) 01:03, 17 November 2017 (UTC)
One concern I have is that this is not an article about the idea of "enterprise client-server backup", but instead an article about the particular enterprise features of Retrospect, written to obscure or not draw attention to the, in fact, narrow focus of the article. It makes me a bit nervous when you suggest that other products may have other features not described here, because - well, then this really is just an article about "Retrospect" in all but name. I am not sufficiently familiar with the topic to know whether a concept described here is a general one, or a product-specific feature written in general language, but - since the article is supposed to be the former, I've assumed the former and rewritten accordingly. With my edits, this new section is shorter and more to the point, and so I think adding in that NetBackup feature would be a good idea. There's room for it now. JohnInDC (talk) 02:59, 16 November 2017 (UTC)
As a general matter, any concept that is now described using what I presume to be Retrospect's name for it (like "Automated data grooming", "High-level Dashboard" or "Monitoring System Integration") needs to be revised to describe the general function that this (probably trademarked) phrase encapsulates. It's worth noting in this regard that despite the plethora of backup programs on the market, for practically any platform one can imagine, the entire rest of the article is written without regard to any one of them. This new section should strive for and achieve that level of generality. JohnInDC (talk) 03:30, 16 November 2017 (UTC)
Those particular phrases are not trademarked. Here's an article from the physical storage firm Iron Mountain that talks about "grooming" data in connection with archiving it. Here's another from the developers of Backup Exec that talks about the "grooming" capabilities of their product's Data LifeCycle Management; I'll at least ref that in this article section. As pointed out by scope_creep a couple of months ago, there's a WP article on business dashboards; I didn't think it was a good link for the old "Retrospect (software)" article, but that's because Retrospect Inc. IMHO shouldn't have called their feature a "Dashboard" in the first place. "Monitoring System" is just a catchall term the Retrospect Inc. tech writer came up with to cover Nagios, Slack, and IFTTT; I'll lower-case "system"—and other words—in the item headings to put your mind at ease. As to the rest of the article section having been written without regard for any other client-server backup programs, that's because: (a) I've never used any of them, (b) they seem—unlike Retrospect—to have their documentation in a bunch of Web pages instead of a comprehensive User's Guide, and (c) I can't Google for the Web page covering a particular feature if I don't know what the term for that feature is in a particular program. I've mentioned the non-standardized terminology problem before; I (perhaps idealistically) propose to solve it for the article section by letting the WP hive mind loose on it, after I've written what I reasonably am capable of. DovidBenAvraham (talk) 21:54, 16 November 2017 (UTC)
See, that's a problem. You didn't write an article about "Enterprise client-server backup" software. You wrote an article about Retrospect (whose features you exclusively described and referenced) but portrayed it as an article on the general subject. It's not fair, and it's not right, to write a narrow, limited article claiming to cover the entire field, and then wait for other editors to (maybe!) 1) detect the shortcomings and 2) remedy them. If you don't know enough about the general subject area of "Enterprise client-server backup software", or don't feel comfortable finding sufficient references to do a good job of it, then really you shouldn't be adding the material to an article at all. JohnInDC (talk) 01:15, 17 November 2017 (UTC)
I've revised the intro paragraph to remove the quote, and take the information in it to state directly what "Enterprise client-server backup software" is. It's a bit redundant still but it's clearer. I also edited the list of features to play down the operational or interface aspects that are specific to Retrospect, in favor of generic descriptions into which refs to other brands of software can (hopefully) just be dropped. The article is still, unfortunately, disproportionately driven by Retrospect and its feature set, but at least this way the reader will not come to think that all such software provides each and every feature and interface element that Retrospect does. JohnInDC (talk) 01:42, 17 November 2017 (UTC)
Except that problem (c) in my "21:54, 16 November 2017 (UTC)" comment no longer exists for one other client-server backup "program"—which actually consists of multiple interacting programs. That "program" is NetBackup, whose Wikipedia article contains a "Main features" section (and which has an old annotated features list here). That contains enough information for me to Google with; here's a table of comparative enterprise features—I had trouble trying to indent this table so I haven't:
Retrospect feature name | NetBackup feature name
Improved disk-to-disk-to-tape capabilities | NetBackup Replication Director
Create synthetic full backups | Synthetic Backups
Automated data grooming | basic disk staging
Powerful "backup server" | Multi-streamed Backup
Block Level Incremental Backup | Block-Level Incremental (BLI) Backup
Pre-scanning of client volumes | NetBackup Accelerator
Administrator Console | Java administration console and activity monitor
User-initiated backups and restores | NetBackup Search and Operational Restore
"High-level Dashboard" supplementing the Administrator Console | VERITAS NetBackup Operations Manager or Ops Center
E-mailing of notifications about operations to chosen recipients | Error message identification, categorization and troubleshooting
"Monitoring System Integration" | MIB files for NetBackup OpsCenter?
"Script Hooks" | MIB files for NetBackup OpsCenter?
Avid Production Tool Support | apparently not available
"Advanced network client support" | Support for leading networking topologies
Cloud backup—fast uploads/downloads of massive amounts of data from/to local disk | Cloud storage/connector, also Amazon AMI (run in A. EC2 cloud)
not available | Multiplexed Backup
I'll start incorporating ref'd mentions of NetBackup into the section items this afternoon. I will be reverting your editing-out of "powerful 'backup server'" yesterday, since we now see that that feature precisely corresponds to NetBackup's Multi-streamed Backup feature. DovidBenAvraham (talk) 05:42, 17 November 2017 (UTC)
It would be efficient for both of us, JohnInDC, if you could give me an idea in advance of how you think one of my section items should be enhanced to show that the feature is implemented in both Retrospect and NetBackup. I would be inclined not to give NetBackup's name for the feature—thus making the item longer—if the reader can figure out the NetBackup name from the ref without an additional hint. OTOH, if there are significant differences between the two implementations of the feature, I would mention that in the item. DovidBenAvraham (talk) 07:17, 17 November 2017 (UTC)

Please add NetBackup refs to the existing text, and add back in the "multithreaded server processes" section - but call it that rather than the contentless "powerful backup server" term that was there before. Describe what it is in a sentence. Separately - we've been through this before, and I should have said it above, because it's an additional issue with the section. Your aim, your goal here, is to identify articles written by reliable third party sources that describe what "enterprise client-server backup" programs are, and how they function, and how they uniquely satisfy the particular needs of enterprise. Then distill and summarize them here, in your own words. When instead you examine different kinds of software marketed for "enterprise" and look for features that they have in common and deduce from the common feature set that these are important features of enterprise client-server backup software, and report it as such, it's SYNTHESIS and original research - that is to say, it's your assessment and view and interpretation of the subject matter area, not those of the third party sources. (This is in fact a problem with the entire article.) Don't examine additional software products for features to improve your Synthesis and OR. Instead, find more refs to knowledgeable people talking about this general area and work them in. Next. Please do not make a feature table. A feature comparison table of enterprise client-server backup programs is the province of PC World or Enterprise Computing (if there is such a thing), not Wikipedia. We have been through this before too. Further, just from a practical point of view, with only two programs the table would be sadly incomplete (and with no indication of what would complete it); and if someone were to complete it, it would probably take up three or four sideways scrolling screens. On top of which, using each software's marketing department term for essentially the same feature adds confusion, not clarity. Please. 
I know you know a lot about Retrospect and would like to find a place in the encyclopedia for it, but you have to stop trying to shoehorn it in. The material you've added, as now pared down, is in keeping with the rest of the article in scope and length. Please don't add needless length and complexity with what you're proposing. Thanks. JohnInDC (talk) 12:33, 17 November 2017 (UTC)

Please give me credit for having learned something about SYNTHESIS and original research over the last 2 months! That's why I concentrated last weekend on getting third-party refs (notice that my original version of the new section literally doubled the number of refs in the whole article) into my Sandbox version of the new section, rather than trying to Google for equivalent features in other client-server backup apps. If there are more third-party articles, I can't find them. I only started doing the equivalent-features Googling last night, because you pushed me on it. I also have absolutely no intention of putting the table into the article section; I only built it last night as a somewhat heavy-handed way of proving to you, in this Talk page, that my hypothesis about other client-server backup apps having the same features is fundamentally correct. DovidBenAvraham (talk) 14:06, 17 November 2017 (UTC)
Phew, thanks, hah! I do appreciate the third party sources, which I found quite helpful in revising that first section. The problem you (we) have with the rest is that if there aren't additional third party sources for what is described here, then we really are just Making Stuff Up, and, no matter how good or sensible it is, it shouldn't be here. I am not suggesting taking out what's there (though I suppose there's a pretty good argument for it), but do think that expanding would be pushing it. JohnInDC (talk) 15:21, 17 November 2017 (UTC)
Now that I have NetBackup's names for these features, I may be able to find additional third-party sources. In fact, because client-server backup apps for Windows seem to be quite competitive, I may even be able to Google other app-makers' names for equivalent features. As far as Making Stuff Up goes, my original version of the new section had a third-party ref for the enterprise need for each one of the features (except one, IIRC). Besides, although I wouldn't use this as an excuse for the new section, all the preceding sections of the article seem to be—based on the paucity of refs—Making Stuff Up from the finest of IT received wisdom as of about 8 years ago. DovidBenAvraham (talk) 15:59, 17 November 2017 (UTC)
I've now replaced the section's items with NetBackup-inclusive versions that keep your edits when they don't oversimplify. I eliminated quote marks in the item headers except where absolutely necessary, and I eliminated initial-upper-casing where possible. I added a short second prgf. to the section lead defining "media set", because I decided that the section needs to distinguish between the medium that holds one or more backups and the individual backups themselves; the paragraph briefly explains why I picked that particular term (which I don't surround with quote marks after the initial definition).
I added an item for the one feature, "multiplexed backup" (which I put in quotes because NetBackup's use doesn't match any meaning in the WP article), that NetBackup has that Retrospect doesn't have. If I later manage to find refs for any features equivalent to the ones in the section in any enterprise client-server backup app, all I'll have to do is to add the refs unless the features operate substantially differently in that app.
One problem that I encountered is that, after substantial Googling, I can't find any mainstream (as opposed to Avid's own or others' video-production-oriented) backup apps other than Retrospect that support Avid production tools. Retrospect just added this support in September 2017, and I had assumed it was because Avid Technology had released some interface software that made it practical. If my assumption is correct, then other developers of enterprise client-server backup apps should soon add the same feature (if you were the CEO of a large advertising agency, for example, wouldn't you want to do something other than fervently hope that your video-editing employees/contractors are backing up their work?). If I'm wrong, however, then that feature will no longer belong in the section; therefore I'd like your advance permission to move that feature to the "Retrospect (software)" article. DovidBenAvraham (talk) 03:41, 20 November 2017 (UTC)
I've removed "media set" again. It is a term invented by and used by a single company, it adds a confusing layer of abstraction to what is, in the end, simply a "backup destination", and it is not unique to enterprise software. JohnInDC (talk) 11:33, 20 November 2017 (UTC)
I've changed all the former occurrences of "media set" to "set of backups". All the client-server backup developers used to use the term "backup set", but some of them—I don't remember which ones—started using that term for "backup source" instead of "backup destination" and one—the developer of CrashPlan IIRC—started using the term for both. That's why EMC switched. Besides, if you went to school recently enough to be exposed to set theory, sets have members and I referred to a member of a "media set" in some cases. DovidBenAvraham (talk) 19:38, 20 November 2017 (UTC)
What is wrong with using the noun "backup" to refer to the second, spare copy of the main data set? It may be technically imprecise but no more so than saying "delete" for "remove the file's entry from the master file reference", or "file" for "concatenated data locations joined by pointers" or whatever the technically precise term might be. You don't need to say "media set" (which fails entirely to convey the idea of a "backup"), or "set of backups" (which sounds like more than one backup) when you have a perfectly serviceable English word that says what the thing actually is, right on hand. Good grief, it's the title of the article. (Incidentally, the idea of "media sets" or "sets of backups" is not even an enterprise backup feature, but was common in Retrospect's consumer offering back when drive capacities were smaller.) "Media sets" and "sets of backups" add confusion without conveying any necessary additional clarity. JohnInDC (talk) 20:35, 20 November 2017 (UTC)
I didn't say "media set" was an enterprise backup feature; I use it in the Retrospect article. But a bit of Googling shows that Microsoft is using it too as of 2016. Anyway, you are suffering from an imprecision of IT terminology; we habitually use the noun "backup" to mean both a copy of a particular drive-equivalent and a run that creates one or more such copies—which we should call a "backup run". Let me give a clarifying real-world example: Saturday I ran a Retrospect script that made full backups of 6 drives on 3 different Macs. Those backups were all put onto Media Set White on a single 500GB portable USB3 drive named G-DRIVE White. Yesterday and this morning I ran another Retrospect script that put an incremental backup of one machine—the MacBook Pro on which I'm writing this—onto Media Set White. Thus as of this evening Media Set White contains 3 backups of my MacBook Pro, together with one backup of each of my other 5 drives; it is thus a set of backups. If I ran out of space on G-DRIVE White, Retrospect would call for me to mount another disk drive; that second disk drive—named e.g. G-DRIVE Ultraviolet—would also become part of Media Set White. And let me assure you; enterprises reporting their use of Retrospect have Media Sets that consist of dozens of disk drives or dozens of tapes. If I were using Retrospect Windows or using a pre-2009 version of Retrospect Mac, Media Set White would instead probably be named Backup Set White (although I could name it anything I want). You might find that confusing, but I got used to Retrospect's terminology back in 1995—when I was backing up to multiple DAT tapes. DovidBenAvraham (talk) 01:35, 21 November 2017 (UTC)
I appreciate that you have got used to Retrospect's peculiar terminology, but a Wikipedia article is not the place to propagate it further. And as I pointed out in my comments, whatever technical imprecision the term "backup" may introduce, it's sufficiently accurate for the concept that you are trying to convey. It doesn't matter whether a backup is on one disk or two tapes or in the cloud. It's a backup, a spare set of client machine data that can be restored to the client machine in case of a problem with the client. Like with file deletion, like with files themselves (which can also be spread across different drives, right?) the actual location of the logical "backup" doesn't matter. JohnInDC (talk) 02:07, 21 November 2017 (UTC)
It's not "peculiar terminology", it's terminology that copes with the fact that for many years we've been creating backups—and sets of full-plus-incremental backups—that exceed the capacity of a single disk (floppy, super-floppy, or hard) or tape. In particular your changing of "media set" to "backup" creates nonsense when applied to items which require mention of "member", such as "User-initiated backups and restores" and "Cloud backup—fast uploads/downloads of massive amounts of data from/to local disk". IMHO you should review the mathematical concept of sets, which is now taught in high school but wasn't back in the 1950s (I had to learn it when I dropped back into college to do an accelerated Computer Science major at the age of 48). DovidBenAvraham (talk) 06:06, 21 November 2017 (UTC)
I get it. I get that a logical backup may exist physically in several places. I am making the point that your technical precision comes at the expense of clarity, with no corresponding gain. It's pedantic. The article can easily be rephrased to avoid nonsense, as I've just done with "user-initiated backups & restores". Also please stop putting quotes around ordinary phrases like "backup server". JohnInDC (talk) 14:08, 21 November 2017 (UTC)
More generally, I expect to remove some of the more detailed descriptions of the software features that you've added back in. In a section about the particular characteristics of this kind of software, we don't need, and don't want, a verbal description of (e.g.) how one software's GUI panel works. It's a high-level, abstracted discussion. Say what a feature is and why it's there. If only one application has it, then it may not be a feature of the product category at all. JohnInDC (talk) 11:43, 20 November 2017 (UTC)
Consider too that despite the hundreds (thousands?) of non-enterprise backup programs that are available for MacOS, Windows, Linux and other OSes, the rest of the article doesn't once detail how one or another application accomplishes a particular task. It just describes the task or characteristic and moves on. There is a lot to criticize about the refs (not) employed elsewhere in the article, but the concepts are presented simply, and in clear prose. Aim for that. I am struggling to think of a good reason why any of these high-level concepts needs to be illustrated by precisely how Retrospect, or NetBackup, or any other app accomplishes the task. JohnInDC (talk) 12:39, 20 November 2017 (UTC)
I've now rewritten the section's lead paragraph, which—probably in your zeal to eliminate the mention of World Backup Day and my associated long quote—you left as an ungrammatical mess. The last sentence of the rewritten paragraph emphasizes the key role and position of the backup administrator. The fact that, almost always, the backup administrator's position is as a member of office administration rather than IT is the justification for all the features in the User Interface subsection. In case you have forgotten the argument over the contents of the "History" section of the "Retrospect (software)" article, here's the suffering that the typical Retrospect Windows backup administrator—who doesn't have access to the server room—has to go through because he/she has no Console. DovidBenAvraham (talk) 03:58, 23 November 2017 (UTC)
In case you don't understand what "adhere to legal requirements for the maintenance and archiving of files and data" means, let me tell you a personal horror story. My mother died in 1994, and I used inherited money to pay off my mortgage. About a year or so later, I discovered that the official NYC property records office still listed the mortgage as a lien against my apartment. I was then told by an insurance company executive I tracked down "Our Midwestern subsidiary bought your apartment mortgage from the bank that issued it, but the mortgage records were stored in a basement that was flooded by the Mississippi river—so they've been destroyed; despite that, after verifying it with the bank (which had continued to service the loan after selling the mortgage to the insurance company), we'll notify the official NYC property records office that your mortgage has been paid off." The insurance company did for me what he said they would do, but it makes me sick to think of the problems the other thousands of people whose mortgages were bought by the insurance company's Midwestern subsidiary must have had. IMHO this incident bears some resemblance to the Wells Fargo account fraud scandal, except that there was no fraud involved (unless the insurance company executive was lying to me). In that scandal (just keep reading the article linked-to in the preceding sentence), the Wells Fargo CEO was forced into retirement and the bank "repaid fraudulent fees and paid damages to those affected" to the tune of $110 million. Wells Fargo fired 5300 employees, and then had to "rehire some 1000 employees who had either been wrongfully terminated or who had quit in protest of fraud." In the case of the Midwestern insurance company subsidiary whose basement flooded, there was no fraud (I hope)—but I'd hate to be in the shoes of that company's CEO if the loss of records had happened 20 years later. 
And that's why enterprises these days appoint backup administrators, and don't let them be part of IT; and that's why I put the last sentence into the lead paragraph. DovidBenAvraham (talk) 05:59, 23 November 2017 (UTC)
In my rewrite of the lead paragraph of the section, I put back a brief mention of World Backup Day. I did so because the new section subsumed the section that mentioned it, and because FleetCommand said "As far as I can tell, what's written here is directly related" when you tried to delete that section at 1:22 on 14 October 2017. I have not put back in the long quote from the Chris Preimesberger article, even though it is directly relevant to that paragraph, because you don't like quotes. I think the first two sentences are somewhat redundant, and could probably be condensed into one sentence, but—other than grammar corrections—I left those sentences as you wrote them. DovidBenAvraham (talk) 00:15, 24 November 2017 (UTC)
Early this morning I took a look at the Backup Exec article (8 pages, JohnInDC, that even include a discussion of how Backup Exec compensates for not having "Multiplexed backup"—which you disallowed for the Retrospect article!), and discovered a ref to a site called Helpmax.net that has an extensive hierarchy of documentation pages for that enterprise client-server backup app. I immediately captured a low-hanging fruit by putting into the "User interface" sub-section of this article a ref to Backup Exec's "Administration Console", modifying the description of that item to include NetBackup and Backup Exec layout (Backup Exec's layout is superficially different from Retrospect's and NetBackup's) and terminology variants. I expect to find that Backup Exec also validates my basic hypothesis that justifies this article section, as expressed in my "04:01, 15 November 2017 (UTC)" comment and exemplified in my "05:42, 17 November 2017 (UTC)" comment above, so I'll be adding more refs and modifications to feature descriptions to this section in future weeks. BTW, somebody may object that both NetBackup and Backup Exec are currently owned by Veritas. However the history is that the two apps were separately developed by different companies (I actually owned and used the "MaynStream" ancestor of Backup Exec from 1989 to 1992, when MaynStream proved unable to restore a backup of my wife's Mac I had just made on my Maynard Electronics tape drive) that were separately acquired by Veritas Technologies before it was acquired—and spun off 11 years later—by Symantec. The two apps have at-least-superficially-different features and terminology, so I consider it appropriate to treat them as separate examples in the section. DovidBenAvraham (talk) 19:32, 10 December 2017 (UTC)
A few of you may be wondering why today I deleted, from the "Pre-scanning of client volumes" item in the "Performance" sub-section, the sentence "However, for one application as of 2012, the apparent implementation of this feature actually somewhat increased backup time for a remote source" and substituted a different Backup Exec ref. The answer is a poster-child illustration of the differing-terminology problem that is the bane of this section. It turns out that "pre-scanning" in Backup Exec refers, not (as I had thought) to the speeding-up feature that this item discusses, but to a different feature that actually slows down backup runs—especially for remote clients. Backup Exec—like Retrospect and NetBackup—can now use the Windows USN Journal (and, at least for Retrospect, FSEvents on the Mac). It's just that by 2012, when all the enterprise client-server backup developers discovered they could use the journaling facilities of newer OS versions to eliminate time-consuming preliminary scans in doing incremental backups, the Backup Exec developers had already given the "pre-scanning" name to their pre-existing feature. I'd love to change the name of this feature in the article to "Instant scanning", which IMHO precisely describes it, but JohnInDC and scope_creep would undoubtedly object because "Instant Scan" is what Retrospect Inc. named the feature in their User's Guides. DovidBenAvraham (talk) 21:38, 7 February 2018 (UTC)
I decided I had to change "Pre-scanning of client volumes" to "'Instant' scanning of client volumes", to avoid confusion with the Backup Exec name for a different feature. FYI "'Instant' scanning of client volumes" is a feature that speeds up an incremental backup run, by making it unnecessary to first compare the headers of all the files on a source disk to a list of files already backed up (called a Catalog File in Retrospect) onto the destination set of backups. OTOH Backup Exec documentation describes as the "Pre-Scan option" one that can slow down a backup run, because the "Pre-Scan option in Backup Exec will cause the job to first look at all data to be backed up prior to actually going to backup that data so it can create a percentage bar for the job." I used the term "'instant' scanning", similar but not identical to the Retrospect term, because neither NetBackup nor Backup Exec has a term for their use of the USN Journal or FSEvents. I've changed the Retrospect article's use of this feature name accordingly. DovidBenAvraham (talk) 00:52, 11 February 2018 (UTC)
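For anyone following the distinction drawn above, here is a minimal Python sketch of the two ways an incremental run can decide what to copy. It is an illustration only, under stated assumptions: the `Catalog` and `Journal` structures and both function names are hypothetical stand-ins, not the actual interfaces of Retrospect, NetBackup, Backup Exec, the USN Journal or FSEvents.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins (assumptions for illustration only):
# a catalog maps path -> (mtime, size) as recorded at the last backup;
# a journal is a list of (sequence_number, path) change records kept by the OS.
Catalog = Dict[str, Tuple[float, int]]
Journal = List[Tuple[int, str]]


def changed_by_full_scan(source: Catalog, catalog: Catalog) -> Tuple[List[str], int]:
    """Compare the metadata of every source file against the catalog."""
    examined = 0
    changed = []
    for path, meta in source.items():
        examined += 1  # every file on the source volume is looked at
        if catalog.get(path) != meta:  # new file, or metadata differs
            changed.append(path)
    return sorted(changed), examined


def changed_by_journal(journal: Journal, last_seq: int) -> Tuple[List[str], int]:
    """Use only the change records written after the previous backup."""
    recent = [path for seq, path in journal if seq > last_seq]
    # A file may appear in several records, so de-duplicate the paths.
    return sorted(set(recent)), len(recent)


catalog = {"a.txt": (100.0, 10), "b.txt": (100.0, 20), "c.txt": (100.0, 30)}
source = {
    "a.txt": (100.0, 10),
    "b.txt": (250.0, 25),  # modified since the last backup
    "c.txt": (100.0, 30),
    "d.txt": (260.0, 5),   # created since the last backup
}
journal = [(1, "a.txt"), (2, "c.txt"), (3, "b.txt"), (4, "d.txt"), (5, "b.txt")]

scan_changed, scan_examined = changed_by_full_scan(source, catalog)
journal_changed, journal_examined = changed_by_journal(journal, last_seq=2)
print(scan_changed, scan_examined)
print(journal_changed, journal_examined)
```

Both functions select the same files; the difference is that the full scan's work grows with the number of files on the source, while the journal-based selection's work grows only with the number of changes since the last run. Backup Exec's "Pre-Scan option", as quoted above, is a separate thing again: an extra pass over the data to be backed up so that a percentage bar can be shown.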

"Disk volumes—known as sources—whose files are backed up"

I've reworded this section. The caption was problematic, with a parenthetical and to my eyes superfluous embedded definition. I'm also not quite sure what it meant to say, what concept it was trying to convey that was distilled from the two items that followed. It looked like, "making sure the data doesn't change while you're backing it up" and so I took a crack at something else. I also tried to say a little more clearly what "script hooks" do. It doesn't say anything really in the one third-party source that's cited, and so I reworked the sentence about Retrospect's capability. This entry still needs tweaking - I don't think for example we need to describe the precise way that our two selected applications accomplish this task, but rather just say what the general need is within the "enterprise client server backup" sphere and be done, particularly since the NetBackup description is unclear ("provisions for") and the cite to the manual (essentially a list of software) isn't enlightening. JohnInDC (talk) 11:38, 4 December 2017 (UTC)

I also don't think that Avid production support needs to be singled out. First it's not clear what's problematic about that one application that it needs its own special backup processes. Second, the supporting reference is (ultimately) to a Knowledge Base article that literally just says, "we now support Avid storage devices" and then lists general functions. We should just include the idea for Avid under the more general discussion, identifying the problem that "the files change a lot as you are trying to back them up" and then capturing it with a general statement, like, "This class of software makes provisions for keeping source file data static while it is being backed up, via scripts or provisions for particular databases or software applications". Right now it looks like this section is being defined more by the Retrospect manual and product capabilities than anything the industry is saying about itself. JohnInDC (talk) 11:53, 4 December 2017 (UTC)
If you'd just clicked the first three words of the bolded item title on the "Avid production tool support" item, you would have been transported to Media Composer (I changed the link yesterday from Avid Technology to make it clear to the truly ignorant). There you would have been greeted with "Since the early 1990s, [Avid] Media Composer has been the dominant non-linear editing system in the film and television industry, first on Mac and then also on Windows." That means that practically every professionally-made film, TV show, and commercial you've watched for the last 25 years—unless you're an old-movie/old-TV fan—has been edited with an Avid production tool. Even I knew that when Retrospect added the feature in September 2017, and I'm not a video person. If you'd then searched for "Avid" in this Talk page, you would have found my "03:41, 20 November 2017 (UTC)" comment. The core point of that comment is "If my assumption is correct, then other developers of enterprise client-server backup apps should soon add the same feature (if you were the CEO of a large advertising agency, for example, wouldn't you want to do something other than fervently hope that your video-editing employees/contractors are backing up their work?)." The first sentence of that comment says "... after substantial Googling, I can't find any mainstream (as opposed to Avid's own or others' video-production-oriented) backup apps other than Retrospect that support Avid production tools." The last sentence of that comment says "If I'm wrong [in my assumption], however, then that feature will no longer belong in the section; therefore I'd like your advance permission to move that feature to the 'Retrospect (software)' article." 
So, even though I don't think your revised subsection title "Source file integrity" covers Retrospect's "Avid production tool support" (their name for the feature, not mine), I'll let it stand because that subsection may eventually only include "Script hooks" (their name again, not mine)—for which your revised subsection title is appropriate. DovidBenAvraham (talk) 13:14, 4 December 2017 (UTC)
Your mistake this morning, which resulted in your replacing a clunky subsection title with an inappropriate one, shows that the idea of looking for a link in the bolded item header of a description list is not obvious to all readers. Therefore, to the sentence "Retrospect also supports several enterprise client-server backup features" that begins the "Enterprise client-server features" section of the "Retrospect (software)" article, I intend later today to add a parenthetical clause that essentially says "To see the description of the features in a particular list item below, click on the bolded item header." I know that you will feel that "click on" violates some Wikipedia stylistic rule, so I'll use some euphemism for that term. You may want to mess around with the wording of that parenthetical clause, but I hope you will agree—since I'm adding the clause in your honor—that the clause itself is necessary. DovidBenAvraham (talk) 14:23, 4 December 2017 (UTC)
I know what Avid is. I was asking what, among the hundreds or thousands of programs used today in "enterprise", makes backing up Avid so different that it requires its own special accommodation; and if so, what that accommodation might be. The wikilink doesn't answer that. JohnInDC (talk) 14:30, 4 December 2017 (UTC)
What concept was the original caption trying to capture - what did it mean, and what did the two entries have in common? JohnInDC (talk) 14:32, 4 December 2017 (UTC)
I'll answer JohnInDC's second question first. What—in my mind—the two features have in common is that they have something to do with the enabling conditions in which a source backup takes place, not with how the backup is done ("Performance") or how the backup is controlled ("User interface") or the network on which the backup is done ("LAN/WAN/Cloud"). One of the two features ("Script Hooks") enables "pausing applications that might change source data during a backup"; the other feature ("Avid production tool support") enables backups of Avid Media Composer files, whose apparent special nature I'll allude to in my next paragraph. I thought the caption "Disk volumes—known as sources—whose files are backed up" captured that commonality reasonably well, but someone else might be able to think of a better caption. In any case the caption "Source file integrity" doesn't seem to cover the feature "Avid production tool support", and I'm wondering whether JohnInDC is so in love with that caption that he wants to eliminate mention of the feature that it doesn't cover.
The answer to JohnInDC's first question is that I don't really know, but—according to the referenced Retrospect Knowledge Base article—"For Retrospect to see the Avid volume, you need to use Avid’s utility to mount the volume on the file system where Retrospect is running. After mounting it, the volume will appear in Retrospect’s list of sources, and you can add it to the above [Backup, Copy/Duplicate, Archive, and Restore] scripts." For further hints as to "what ... makes backing up Avid so different that it requires its own special accommodation", JohnInDC could try reading the remainder of the Media Composer article and the Non-linear editing system article it links to. But I haven't tried to do that, because—whatever that reason is—all I need to know for the purposes of this article is that it takes a special Avid utility to enable it on a non-specialized backup system and that—so far—only Retrospect has implemented the use of that utility in a backup application that is not specialized for video production purposes.
To emphasize why backing up Avid files might qualify as an enterprise feature, let me close with another one of my heavy-handed examples from a slightly different branch of commercial art. Recently a backup administrator from a foreign country posted to the Retrospect Inc. Forums. He worked for a company that went into that country's equivalent of Chapter 11 bankruptcy, and with his boss's consent was given all the Retrospect backup tapes to continue the work with the same clients. He designs album covers, and often needs to retrieve old files from this system if the album is going to be re-released or released on a different format (vinyl, deluxe CD etc). JohnInDC should be able to understand from this that (a) without backups a piece of computerized commercial art becomes dead—impossible to modify without highly-expensive restoration, and (b) why the backup administrator of an enterprise would therefore be justifiably concerned with seeing that the enterprise's computerized commercial art—such as video—is reliably backed up. Retrospect has always had the ability to back up ordinary graphics files, but apparently until now has not really had the ability to back up Avid files. That's why, even though—based on my quick Google search—no other general-purpose enterprise client-server backup app yet has the ability to back up Avid Composer files, I tentatively listed "Avid production tool support" as an enterprise backup feature. DovidBenAvraham (talk) 19:03, 4 December 2017 (UTC)
The ability to back up Avid files doesn't sound conditional or contingent or anything. It's just a kind of file or dataset that is (apparently) hard to back up, which Retrospect has made a special case out of. Script hooks are characterized by "when" and Avid support by "what", and it seems a stretch to put them into any meaningful category together. I'm also not sure why Avid support constitutes a uniquely "enterprise" backup feature. The problem is with the software data structure (or something else unique to Avid), not with the specialized backup needs of the firms that use it. It's a Retrospect feature, not an enterprise solution (which as you note you've suggested before might be the case). JohnInDC (talk) 20:23, 4 December 2017 (UTC)
OK, the first paragraph of my "19:03, 4 December 2017 (UTC)" comment may not have been well-phrased. The "when" condition for Retrospect's script hooks is that they can ensure that backup of a particular database system occurs only after the database system has been quiesced—to do otherwise results in a real mess. The "when" condition for Retrospect's Avid production tool support is that without it Retrospect can't see an "Avid volume", and any attempt to back up Avid files copied onto another volume likewise results in a real mess. Here (see especially the 4th post) is one discussion, a few years old, of what kind of mess you get into without using a special Avid-oriented backup app—the thread recommends a particular recently-made-free one. Here (see especially the first paragraph) is the topmost page, on an otherwise Russian website, for the original free special Avid-oriented backup app. Here is the IT-backup-vs.-app-being-marketed comparison page for a definitely-not-free special Avid-oriented backup app; here follow embedded links for the ISIS hardware and Avid NEXIS (sorry, no WP article for this) "Avid volume" storage hardware that comparison page refers to. Avid Technology has evidently written a utility that makes an "Avid volume" visible as a Retrospect source, and Retrospect Inc. must have gone to some effort/expense to duplicate (or buy) the features of the free apps that enable proper backup of that "Avid volume" once it is visible.
As to why Avid support IMHO constitutes a "uniquely 'enterprise' backup feature", think again about the large advertising agency I mentioned in my "03:41, 20 November 2017 (UTC)" comment on this Talk page—some of whose employees/contractors are producing video commercials. It is certainly possible that those employees/contractors are regularly backing up their Avid projects, using one of the specialized backup apps mentioned one paragraph above. However no competent enterprise backup administrator, whose role is defined in the first paragraph here as being "the keeper of the data", would rely on those employees/contractors to do their own backup of such important data. As we all know, many computer users have good intentions about doing regular backups, but don't follow through on their intentions. I conjecture that Retrospect Inc. personnel did at least an informal survey of their administrator customers, and found that a capability for routine backup of Avid projects was high on the list of desired features. Since the backup administrator is ipso facto an employee of the enterprise, that would make "Avid production tool support" an enterprise client-server backup feature. The proof of that conjecture should come in 6 months to a year, when makers of competing enterprise client-server backup apps announce their own "Avid production tool support". If they don't, I promise to move the feature to the "Small-group features" of the Retrospect article; until then, I intend to keep it in this article. DovidBenAvraham (talk) 02:05, 5 December 2017 (UTC)
So it's Avid-specific, and "enterprise" only in that many / most people desiring this feature are working at a firm. If a freelancer wants to back up Avid files, they'll need this feature too (whereas they won't need other "enterprise" features such as multi-terabyte hardware devices to upload data to the cloud, or administrator consoles that permit non-technical personnel to run backups). I'm still left unclear about the conceptual category that includes both of these things - script hooks delaying backups until the data to be backed up is static and internally consistent, and Avid support because the files are an incoherent mess to standard backup programs. I don't think these features fit together, and I don't think Avid support really stands as an enterprise feature (perhaps not even with another firm's adding such support). Rather than leave Avid in, speculatively, until some indeterminate point at which maybe it's been long enough without another firm coming in that we decide it's time to take it out, it should stay out until another firm introduces it and then we can have a fact-based discussion about what to make of it. Taking Avid out would solve both the problem of the caption and the crystal ball aspect of its inclusion today. JohnInDC (talk) 02:21, 5 December 2017 (UTC)
I'll admit that, although I have a dinky little home installation of Retrospect Mac, I use a couple of "enterprise" features. I use pre-scanning of client volumes; it saves me about 8 minutes on a daily incremental backup of 1 drive, and about 40 minutes on a once-a-week full backup of 6 drives (of which 3 are inside a Mac so old that its Retrospect Client software predates pre-scanning). I also of necessity must use the Administration Console, since the Retrospect Mac backup server software has not had a built-in GUI since early 2009. From 1995 to 2010, I used older versions of Retrospect Mac that had a non-multithreaded GUI built into the backup server software; the non-multithreading was a pain in the butt, but the backup server Mac sat in my bedroom as it does now. Does that make those two features non-"enterprise"? I say no; pre-scanning—as the third-party article which that feature references says—was specifically developed so enterprises could squeeze more backups of client computers into a nightly "backup window", and the Administration Console was designed so that it could run on another Mac that is outside the locked room where the backup server sits in a true enterprise.
As for Retrospect's Avid support, I doubt most freelancers would use it. AFAIK they would have to have Avid ISIS or NEXIS storage hardware to use the Avid software, but they could copy their Avid files onto another drive and then use the free specialized software mentioned above to back it up. Moreover they might not even have to do the preliminary copying; I doubt Avid Technology developed its utility app just for use in Retrospect, so the freelancers could simply use the utility to make the Avid drives visible on a regular Windows or Mac computer and run a free backup app directly from that. Retrospect does cost $119 even for the lowly Desktop Edition I use. I stand by my belief that Retrospect's Avid production tool support was developed because backup administrators in enterprises asked for the feature. DovidBenAvraham (talk) 04:05, 5 December 2017 (UTC)
You may well be right but for now it’s speculative, unsourced, personal opinion and OR. JohnInDC (talk) 04:26, 5 December 2017 (UTC)
As JohnInDC has requested, I've moved "Avid production tool support" to the "Small-group features" section of the Retrospect article. If other enterprise client-server backup apps add this feature, as I predict, we'll have a discussion about moving it back into this article. DovidBenAvraham (talk) 13:05, 5 December 2017 (UTC)
Your rewording of the section made a fundamental error in redefining "script hooks", JohnInDC. Retrospect "script hooks" are hooks on which to hang scripts that are executed at specific points during administrators' backup strategy lifecycle. They are not the scripts themselves; Retrospect Inc. has provided a number of pre-written scripts to be executed by one or more of the "script hooks", but an administrator can write his/her own scripts—and I've mentioned in my "01:40, 30 November 2017 (UTC)" comment here that I know one administrator who has. I'll fix what you messed up. DovidBenAvraham (talk) 02:48, 5 December 2017 (UTC)
Thanks for fixing that. I haven't seen what you've done yet but you could probably state it pretty cleanly, as "the ability to execute a series of predetermined commands at different points in the backup process, for example to pause execution of programs that would write data to the files being backed up". It doesn't matter for these purposes whether the scripts are prefab or administrator-defined. JohnInDC (talk) 03:23, 5 December 2017 (UTC)
I've slightly enhanced your version, JohnInDC; it again states that scripts activated by script hooks are written in standard scripting languages (meaning e.g. VBS or Bourne shell script)—rather than in Retrospect's own scripting GUI, and that this is (AFAIK) only a feature of Retrospect—although NetBackup accomplishes the same source file integrity result via client add-ons for many different database and other systems. DovidBenAvraham (talk) 13:05, 5 December 2017 (UTC)
I've met JohnInDC's objections to my recent rewrite of the section by rewriting "Script hooks" again as "Backing up interactive applications", emphasizing the pausing/un-pausing of interactive services while providing a clear explanation of what Retrospect "script hooks" are. I paraphrased—with somewhat greater length but greater precision—the quote from the newly-discovered second Rassokhin article, but left the article in as a third-party reference. I've left out any italics because JohnInDC doesn't like them; I thought they were useful in emphasizing the distinction between NetBackup and Backup Exec's built-in pausing/un-pausing provisions and Retrospect's use of "script hooks", as well as the distinction between a Retrospect backup script (written in its GUI) and an external "script hooks"-using script (written in a standard scripting language). DovidBenAvraham (talk) 04:41, 25 January 2018 (UTC)
I had changed "pause" and "un-pause" to "quiesce" (with link) and "relaunch", which are the terms used in the Retrospect Mac 14 User's Guide that introduces the "script hooks" feature. However JohnInDC reverted that; he said "Use Common words where they will do." That's rather ironic, since scope_creep complained, in his/her rather incoherent "09:55, 13 October 2017 (UTC)" comment here, that "IT companies, don't talk in computer terms. All the time, they are driven by branding and marketing. The software guys don't like it, but it is a fact of life." I had changed from "pause" to "quiesce" because I wanted to be more technically precise—remember that I've never done systems programming on the level of the Retrospect engineers. I've now compromised by giving "paused" a technical WP link and "unpaused" a technical Wiktionary link (with no internal dash, because "unpause" is what's in Wiktionary). I made the same change in the Retrospect article. I hope that makes both JohnInDC and scope_creep happy. DovidBenAvraham (talk) 23:48, 27 January 2018 (UTC)
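As an illustration of the general idea discussed in this thread (running external commands at fixed points of a backup, for example to pause a service beforehand and resume it afterwards), here is a minimal Python sketch. It is not Retrospect's, NetBackup's, or Backup Exec's implementation; the class name, the hook-point names "before_backup" and "after_backup", and the demo actions are all hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List


class HookedBackup:
    """Toy backup runner with named hook points (all names are hypothetical)."""

    def __init__(self) -> None:
        # Each hook point holds the actions registered for it, in order.
        self.hooks: Dict[str, List[Callable[[], None]]] = {
            "before_backup": [],
            "after_backup": [],
        }

    def register(self, point: str, action: Callable[[], None]) -> None:
        """Hang an action on one of the known hook points."""
        self.hooks[point].append(action)

    def run(self, copy_files: Callable[[], None]) -> None:
        """Run pre-backup hooks, the copy step, then post-backup hooks."""
        for action in self.hooks["before_backup"]:
            action()  # e.g. pause a service that writes to the source files
        try:
            copy_files()
        finally:
            # Post-backup hooks run even if the copy step fails,
            # so a paused service is always resumed.
            for action in self.hooks["after_backup"]:
                action()


def demo() -> List[str]:
    """Record the order in which the steps of one backup run happen."""
    events: List[str] = []
    job = HookedBackup()
    job.register("before_backup", lambda: events.append("pause service"))
    job.register("after_backup", lambda: events.append("resume service"))
    job.run(lambda: events.append("copy files"))
    return events
```

The point of the sketch is the distinction made above: the hook points belong to the backup program, while the actions hung on them are separate and could equally be vendor-supplied or written by the administrator.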

"Storage media" updating in 2018

Oh my. Whenever JohnInDC lets his fingers wake up before some other part of his brain (the link-following part in this case), I have to get heavy-handed on this Talk page. So I'm going to discuss this now before I rewrite my "03:52, 18 April 2018" edit to the article, so as not to waste effort when I redo the actual edit to meet JohnInDC's objections.

Yes, "unsourced comparison of RDX and standard hard drives that in the end says they're the same" is correct, because the first paragraph of RDX_Technology says the mechanisms in the cartridges are the same. The sentence in that paragraph that begins "RDX cartridges are shock-proof 2.5-inch hard disk drives ...." was added to the article in 2011 as [2]. The shock-proof capabilities of ordinary 2.5-inch portable disk drives have improved since then; I previously found a specifications article by one of the three major HDD manufacturers saying its 2.5-inch drives can survive a 1-meter drop onto a tiled or concrete floor—even though they're not in RDX cartridges. I'll find a couple of such articles again (which I couldn't last night), and use them as references; I just thought using the Wirecutter article made the same point.

That same 2011 edit deleted a section headed "Archiving capability" which described a test of RDX cartridges vs. standard 3.5-inch HDDs that predicted the cartridges' "... media lifespan in typical archival environments is at least 30 years with 99% reliability."; the section was deleted by some editor who considered it to be advertising. I'm not seeing that claim on either of the websites of two companies that used to manufacture RDX "docks".

In fact it seems that no company is manufacturing RDX "docks" any more, which explains why 3 out of the 5 refs for the RDX_Technology article are now dead. It appears that Overland Tandberg and Dell are still manufacturing RDX cartridges with gradually-larger capacities, evidently to satisfy the needs of backup administrators who already own RDX "docks". To find out why backup administrators bought such "docks", I suggest that you read my "02:48, 17 April 2018 (UTC)" comment in Talk:RDX_Technology. The short version is that such administrators (or their bosses) were seduced by either the "shock-proof capabilities" of the cartridges—which I have pointed out above are no longer a genuine advantage—or what I will call the "butterfingers/butterbrains-proof" capabilities of RDX technology. The "butterbrains-proof" capability may still be an advantage if you have backup administrators who think of RDX cartridges as "tapes"; see my "21:09, 15 April 2018 (UTC)" comment on that Talk page. However if the "butterfingers-proof" problem still applied to enterprise backup administrators—who as I described their responsibilities in the last sentence of the lead paragraph here have more than a 50% probability of being women—otherwise needing to swap USB3 cables between 2.5-inch portable HDDs, then there would be a roaring industry supplying electro-mechanical devices to plug hair dryers into wall outlets—an industry which AFAIK doesn't exist (that's a joke, BTW). DovidBenAvraham (talk) 16:56, 18 April 2018 (UTC)

If RDX technology is obsolete and the company moribund, and current standard hard drives aren't any better or worse, then we don't need a paragraph, or even a sentence, comparing them. Better - for both lay readers as well as enterprise backup administrators who still think it's 2011 - simply to remove the needless and outdated comparison altogether. That's what I did. JohnInDC (talk) 17:44, 18 April 2018 (UTC)
And - if we were going to compare them, and say that current disks perform about the same, then we need a source that says, "current hard drives are as robust as ones manufactured using RDX technology", and not synthesize that conclusion from sources that do not, in fact, supply that conclusion. JohnInDC (talk) 17:48, 18 April 2018 (UTC)
The problem is the pre-existing sentence "The main disadvantages of hard disk backups are that they are easily damaged, especially while being transported (e.g., for off-site backups), and that their stability over periods of years is a relative unknown." That was certainly true in May 2009 when it was added to the article, and has a certain degree of truth now. I have decided to leave that sentence in—since I hate to destroy previous editors' work, but to change it to past tense and counteract it with a couple of subsequent sentences added to the same item. The first of those sentences restates and amplifies ("a degree of waterproofing"—which has never been claimed for RDX cartridges) what you objected to before, and refs it with the only manufacturer statement I could find that explicitly states the military drop test that manufacturer's drive allegedly passed. I say "allegedly", because I've reused the ref you objected to as "sources that do not, in fact, supply that conclusion" in the next sentence to say that as of 2018 Wirecutter found that three other "rugged" portable hard drives did not survive even as many 4-foot drops as they were supposed to—much less 29 minutes at the bottom of a 3-foot-deep pool. Moreover the two companies that still manufacture RDX cartridges are not, in fact, moribund (Dell moribund?); they just don't seem to manufacture RDX "docks" anymore. DovidBenAvraham (talk) 01:49, 19 April 2018 (UTC)
It's extraneous clutter. If hard drives no longer suffer comparative fragility, and their long-term stability is now certain, then the section should be revised to say that (or omit the observation altogether). If they do still suffer comparative fragility, then the whole digression into the comparative (conjectural?) advantages of different technologies - which concludes that there isn't any difference in the end - is just clutter and should be omitted, as I've done again. Please leave it as is. Thanks. JohnInDC (talk) 02:52, 19 April 2018 (UTC)
And, again, if you are going to introduce a discussion of these advantages and disadvantages, it can't be based on your conclusions, based on your review of the literature and test results. You need a third party source that says, "RDX was designed to address some of the shortcomings of spinning hard drives, but the latter have improved a lot over the years and nowadays they're pretty much on a par" - or whatever point it is you intended to make. Please, please don't mash together information from another Wikipedia article (which isn't a "source") with data you've gleaned from manufacturer-claimed product specs and an online testing website (even an excellent one like Wirecutter) to arrive at your own analysis. That's the definition of synthesis, which isn't proper in Wikipedia articles. JohnInDC (talk) 03:22, 19 April 2018 (UTC)
I read the Wirecutter article in more detail, and large parts of it are devoted to the persistent problem of hard drive fragility - it's full of caveats like, "[P]ortable drives are designed to withstand a little more abuse, though one bump or drop can still lead to failure", "don’t buy a rugged portable hard drive", and "if you’re concerned about dropping your drive, you should consider a solid-state drive. SSDs cost more, but they offer faster performance, and your data won’t fall victim to butterfingers." The article doesn't begin to suggest that hard drives have surmounted the fragility issue, and it can't be cited for that proposition. JohnInDC (talk) 03:46, 19 April 2018 (UTC)
So you have chosen to leave in the sentence "The main disadvantages of hard disk backups vis-a-vis tape are that they are more easily damaged, especially while being transported (e.g., for off-site backups), and their stability over a period of years is relatively unknown." That sentence came with no reference when it was inserted into the article, which as I have said above was in 2011. My friend's personal experience using the same set of 3 ordinary USB3 portable HDDs has now lasted since May 2015. He has on a few occasions dropped a drive onto a table from a height of 6 inches, although not onto a hard floor underlaid by concrete from 3 feet. He reuses each drive every 3 weeks, but has restored from contents backed up 2.5 weeks ago on several occasions. My friend transports one drive a week to—and another drive from—his safe deposit box at a local bank branch by carrying it in his pocket, which admittedly isn't as rough on the drives as carrying them on the floor of a car would be. He has Retrospect do a verification of every backup using either an MD5 digest (for clients) or a byte-by-byte comparison (for locally-attached drives), and has never had a verification difference that wasn't explained by the file having changed between backup and verification. In short, my friend—who does not consider himself to be an exceptionally lucky person—says that sentence needs to come out or be extensively qualified. He thinks the additional sentences I added, and which you deleted, constitute the necessary qualification. DovidBenAvraham (talk) 08:41, 19 April 2018 (UTC)
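For readers unfamiliar with the two verification styles mentioned in the comment above (a digest comparison versus a byte-by-byte comparison), here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a description of how Retrospect implements verification; the function names and the demo file names are hypothetical.

```python
import filecmp
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple


def md5_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_by_digest(source: Path, copy: Path) -> bool:
    """Digest-style check: only the two short digests are compared."""
    return md5_digest(source) == md5_digest(copy)


def verify_byte_by_byte(source: Path, copy: Path) -> bool:
    """Full comparison of file contents (shallow=False forces a content read)."""
    return filecmp.cmp(source, copy, shallow=False)


def demo() -> Tuple[bool, bool, bool, bool]:
    """Check one faithful copy and one copy with a single altered byte."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source.bin"
        good = Path(tmp) / "good.bin"
        bad = Path(tmp) / "bad.bin"
        source.write_bytes(b"backup me" * 1000)
        good.write_bytes(b"backup me" * 1000)
        bad.write_bytes(b"backup me" * 999 + b"backup mE")
        return (
            verify_by_digest(source, good),
            verify_byte_by_byte(source, good),
            verify_by_digest(source, bad),
            verify_byte_by_byte(source, bad),
        )
```

A digest check needs only the stored digest and the copy, which is presumably why it suits backups made over a network, while a byte-by-byte check needs both files readable at once. Either way, a mismatch can also mean the source file changed after the backup, as the comment above notes.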
Or, looking at the same questionable sentence from another point of view, consider my "19:32, 10 December 2017 (UTC)" comment further up this Talk page. In it I describe the complete failure to restore from a tape backup made an hour before (with no verification, because I didn't then know I needed to do that) in 1992 by the ancestor of Backup Exec. Shouldn't we add an equivalent sentence to the "magnetic tape" item in that same article section, saying tape backup is inherently unreliable? The answer is that of course we shouldn't, because the introduction of the DAT/DDS tape drive by 1996 (four years after I bought my original tape drive)—which I used for 14 years while verifying every backup—was a significant advance in reliability over Maynard Electronics' version of the QIC tape drive. My qualifying sentences simply say that there has been an equivalent advance in optional portable hard disk reliability in the four years between 2011 and 2015. DovidBenAvraham (talk) 09:08, 19 April 2018 (UTC)
By now you should understand that your friends' personal experiences, and opinions about the content of Wikipedia articles, are neither proper sources nor a basis for including things here. If hard drives have advanced to the point to which they are as reliable, durable and as long-lasting as tape, then find a source that says so and cite it. In the meantime, please stop synthesizing conclusions. Thanks. JohnInDC (talk) 11:03, 19 April 2018 (UTC)
Actually when you say in your "03:46, 19 April 2018 (UTC)" comment "The article doesn't begin [my emphasis] to suggest that hard drives have surmounted the fragility issue", you're flat out wrong as far as drop-testing is concerned. At the end of its "Don’t buy a rugged portable hard drive" section, the Wirecutter article I had used as a reference says the following: "The A80 and the A65, which we were able to reassemble, survived 16 and 17 drops, respectively. Our computer stopped recognizing the A80 after its 17th impact, and the A65 refused to connect after its 18th collision with the floor. (We opened the drives again to look for any loose connectors we could fix, but no dice.) In short, neither drive we drop tested survived the abuse they’re advertised to endure [my emphasis]." Even if 16 or 17 drops is not as much as the advertised MIL-STD-810G-516.6 requirement, I think it's obviously no longer true that those portable HDDs "were easily damaged, especially while being transported (e.g., for off-site backups)". I think I should be allowed to put back in at least the first qualifying sentence I had added, minus the clause about waterproofing. That clause talked about a damage issue not mentioned in the 2011 sentence, and for good reason. Have you any idea what even 5 seconds of immersion does to the readability of a tape? I do, because in 1967 one of my computer-service-bureau clients phoned me to say that he had dropped his rented-from-us backup tapes into a puddle as he got out of his car in the parking lot of his defense-industry employer. The immediate answer from my computer operations colleagues was that he now owed our service bureau the full price of those tapes, which they said he was welcome to keep although they would be useless as backup.
As for my second qualifying sentence, the third sentence in the first paragraph of the RDX Technology says those drives are 2.5-inch serial ATA hard disk drives—evidently made shock-proof by their enclosing cartridges. The RDX cartridges "are advertised [evidently by Tandberg Data GMBH—my emphasis] to sustain a 1 meter (39 in) drop onto a concrete floor and to offer an archival lifetime up to 30 years." DovidBenAvraham (talk) 11:25, 19 April 2018 (UTC)
I will simply note again that, when you draw your own conclusions from information set forth in different sources, to support things that the sources do not actually say, it is synthesis and not proper for articles. WP:NOR generally also has some useful guidance in that regard. Thanks. JohnInDC (talk) 11:30, 19 April 2018 (UTC)
Here for example is a 2016 reliable source that discusses pros and cons of different media, and which lines up reasonably well with what the text already says: https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html Hard drives: "Fast compared to tape and optical, hard drives are generally reliable for the short term"; "A hard drive is also a mechanical device that’s vulnerable to shocks. You can do everything right with your drive, but drop it on a hard floor as you pull it out of the safety deposit box, and like that, you’re off to the recovery service." It's a consumer-oriented article, and so says less about tape, but offers these observations, among others: "Magnetic tape is still in the discussion for enterprise. It’s available in very large capacities" and "It also suffers magnetic and physical degradation over time, though the rate is greatly dependent upon the materials in use." There are others, which pointed out the pros and cons in even greater detail, but I think a few sentences for each is all this article can reasonably bear. I haven't got time at the moment to add references and align the text to them, but it is certainly doable. Any of that is better than synthesis and the opinions of friends. JohnInDC (talk) 11:48, 19 April 2018 (UTC)
I've now simply split the last sentence in the "Hard disk" item after JohnInDC's deletions into two firmly supported sentences; one discussing the transport damage disadvantage, and another discussing the stability disadvantage. This has enabled me to add a clause in the transport damage sentence that says rugged enclosures can partially combat transport damage, a clause which references the Wirecutter experimental (not inferential) results mentioned in my "11:25, 19 April 2018 (UTC)" comment. It has also enabled me to incorporate the supporting refs newly provided by JohnInDC without as much "ref overload"; I even reused his Forbes ref for "Magnetic tape", where it supports what was formerly a link inference that by 2014 there were only 3 remaining "super" formats with LTO in the lead. There's no mention of moribund RDX technology, and there's no mention of waterproofing. DovidBenAvraham (talk) 22:06, 19 April 2018 (UTC)

Works for me, thanks. JohnInDC (talk) 22:11, 19 April 2018 (UTC)

I just took another look at the second ref I added to the RDX Technology article. A 2013 white paper from Tandberg Data, it says "The same 2.5-inch drives are most often used in laptop computers due to their size and locking head feature. With its protective, shock-proof cartridge design, the RDX QuikStor cartridge passes drop tests in excess of one meter onto a tiled concrete floor without damage. Archivability and Reliability: Small-form-factor HDDs like the ones used in RDX QuikStor are specifically designed to significantly improve their mechanical reliability and life. Design features such as ramp-load heads and fluid dynamic bearings eliminate any concern about head-media contact or disk sticking. In fact these mobile HDDs now boast a mean time to failure (MTBF) of 550,000 hours."

I can find no indication that Tandberg Data (now merged with Overland Storage) has actually ever custom-manufactured its own HDDs. I therefore infer that at least part of the shock-proof capabilities of RDX cartridges are due to the inherent shock-proof capabilities of high-quality 2.5-inch HDDs before they are inserted into RDX cartridges. The inferred consequence is that, if a portable HDD is constructed from one of these high-quality laptop HDDs, and has been properly ejected from a computer so that its heads are locked, it's not nearly as subject to transport damage as the section's "Hard disk" item says. As I said in the second paragraph of my "16:56, 18 April 2018 (UTC)" comment, at one point I found a ref from one of the major HDD manufacturers that stated that; if I can find it again I'll change the item accordingly without asking permission. DovidBenAvraham (talk) 04:31, 21 April 2018 (UTC)

The added ref [3] in my "04:55, 9 June 2018" edit of the "Hard disk" paragraph that you reverted, JohnInDC, is to "Timo"'s YouTube video of an experiment. You remember experiments, don't you, like your junior high school general science teacher used to do in class? And by my junior high school standards, it's a pretty transparent experiment. I could have ref'd another YouTube video in which the experimenter tosses a Samsung portable hard drive high over his shoulder, drops it several times from waist height onto a plastic-tiled office floor, and has several people step onto it. However the only demonstration that the drive still works is at the very end of that second video, and besides the second video is narrated in Russian.

I was very circumspect in my use of that "Timo" ref; I inserted it after the word "somewhat" in "somewhat more easily damaged". In my view it is even more reliable than the 2016 Wirecutter review you suggested I use (which has been revised in 2018 so that the results of their tests are summarized in a single paragraph that is the second-to-last in the article), because the Wirecutter reviewers only describe their "rugged drive" tests instead of show them on-camera. And remember that the Wirecutter article revealed that the MIL-STD tests the manufacturers of the "rugged" hard drives reported having passed were invalid (apparently because that particular MIL-STD has wide leeway). So in my view seeing tests performed is fundamentally more reliable than reports of tests performed, no matter how "reliable" the reporting authors are.

In short, I believe the "YouTube not an RS" you invoke in your reversion is not applicable to this particular link. Try reading Wikipedia:Video links; you'll find that it is mostly concerned with copyright violations. That article in turn links to Wikipedia:Identifying reliable sources, but that article in turn says "making sure that all majority and significant minority views that have appeared in those sources are covered". How can there be such views in the video of an experiment? And that experiment is not OR on my part, because "Timo" did the experiment and simply made a video of it.

Unless you can cite a very good WP reference as to why my preceding paragraph is not applicable, I intend to revert your reversion. If you wish we can get a third opinion, but I'm pretty sure it will be in my favor. DovidBenAvraham (talk) 22:43, 9 June 2018 (UTC)

Note further that the hard drive "Timo" is using for his 2016 experiment is not a portable HDD. It is instead visibly (mechanism on the bottom) an internal HDD, a Seagate I think. The mail.ru 2009 demonstration [4] is of a Samsung portable HDD, but a commenter suggested that "maybe it's empty inside during the drop test". Remember that the whole idea of my enhancements to that WP article paragraph is to show that modern portable hard drives are shock-proof enough that you're not absolutely crazy to rely on them for off-site backups with a longevity of up to a couple of years. DovidBenAvraham (talk) 11:55, 10 June 2018 (UTC)
Another thing that raises doubts in my mind about the mail.ru 2009 demonstration is its sound track. Up until the final remarks in Russian, it consists entirely (except for background sounds of drive abuse) of a penny-whistle rendition of the "Imperial March" from The Empire Strikes Back. Does that mean that the demonstrator is behaving like Darth Vader to the portable HDD, or to the viewer? DovidBenAvraham (talk) 23:57, 19 June 2018 (UTC)
A very good counter-argument to JohnInDC's statement about YouTube videos has been staring me in the face for weeks. Despairing of administrator users who don't search the long User's Guides, the head of Retrospect Technical Support started in 2012 making short video Tutorials of the use of popular features. Are these videos hosted on Retrospect.com? Of course not; links to them are hosted on Retrospect.com, but the videos themselves are hosted on YouTube. Have I used them as references in this article and the Retrospect article? Yes in one case, because Retrospect Inc. stupidly never put a 3-sentence explanation of those features into the User's Guides. Do I like to use them as references? No, because I'm a 20th-century person who'd rather re-read a text explanation than re-play a video. Are they reliable? IMHO more than any text first-party reference, where they actually demonstrate the use of software features. DovidBenAvraham (talk) 12:04, 28 June 2018 (UTC)
Except it's not a Retrospect video. It's Some Guy named "Timo". Unless Timo is himself an RS, then the video isn't either. JohnInDC (talk) 12:57, 28 June 2018 (UTC)

And, at last, this [5] seems to be the 2018 manufacturer's page I've been looking for to prove my point about portable hard drives—at least from Toshiba. See the next-to-last sentence in the first paragraph of its "Overview" section. DovidBenAvraham (talk) 19:34, 10 June 2018 (UTC)

After waiting in vain a week for JohnInDC to produce the "very good WP reference" showing that "Timo"'s video of his experiment is not a "reliable source" simply because it's on YouTube, I've put it back into the "hard drive" paragraph as a ref. I've also added the 2018 Toshiba Singapore page as a ref; until some reviewer publishes the results of testing the transport vulnerability of a Canvio 3.0 or some other portable HDD, that ref should serve to support my point about the vulnerability being recently reduced even if JohnInDC re-deletes the "Timo" ref. I've also deleted the entire clause about a rugged enclosure, with its link to RDX Technology, because my deletion of the Wirecutter ref—whose 2018 article revision removed the sentences I had quoted—leaves it without support. DovidBenAvraham (talk) 02:28, 16 June 2018 (UTC)

A look at "Timo"'s follow-on video, in which he destroys a similar internal HDD with a hammer, indicates his video that I had used as a ref wasn't a sufficiently-powerful demonstration. So JohnInDC's deletion of that ref isn't really such a great loss. That's especially true because I have now discovered a 2007 HGST whitepaper that definitively describes their power-loss and ramp load/unload technology and its benefits to their portable HDDs. Since I put that in as a ref JohnInDC has deleted the entire quote in it as "buying advice". However the only "buying advice" in that quote was "The 2.5-inch Travelstar drives are the most rugged mobile drives in the industry." I'm going to put the quote back in with that sentence deleted; the rest of the quote either explains the technology or states that it enables non-operating shock resistance (the kind you want during transport) far in excess of the 350Gs resulting from a 3-foot drop onto a hard floor (and yes, if JohnInDC wants I can re-find a calculated ref for that 350Gs figure). DovidBenAvraham (talk) 02:00, 29 June 2018 (UTC)

I have further pruned the HGST whitepaper quote of drive model names that JohnInDC would consider "buying advice". DovidBenAvraham (talk) 08:48, 29 June 2018 (UTC)
There are too many quotes in the references generally, which do not improve on or further clarify the unadorned reference. They are cluttery, confusing and excessive. We don’t need to both paraphrase or condense a point from the source, and then quote the source verbatim to say the same thing. I’ll be reviewing all the quotations against this measure. JohnInDC (talk) 10:38, 29 June 2018 (UTC)
I am grieved to see that JohnInDC is following in the steps of scope_creep; he is now Making Up WP Rules, apparently because of an inability to read English at the appropriate level. The rule Wikipedia:Quotations#Overuse that JohnInDC refers to in the summary of his "11:01, 29 June 2018" article edit is clearly restricted to quotations used directly in articles, not to quotations in such templates as Cite Web. While a reader is forced to read a quotation placed directly in an article, he/she is not forced to read a quotation in a reference unless he/she mouses over the citation's bracketed superscript.
The four references whose quotations JohnInDC has edited out are all to Web articles containing neither page numbers nor clickable section headings. Therefore it would be difficult for anyone wishing to verify the citations to find the pertinent passages in the references. I thought it would be appropriate to provide quotations of those passages, which could be browser-searched for verification. If I erred by making the quotations complete enough to include embedded numeric information—which would save time during a casual verification, please excuse me.
What I will now do is to put those quotations back into the Cite Web template occurrences, but pared down to exclude any embedded numeric information or "marketing speak". The Forbes article is cited twice, once for "tape" and once for "hard disk", so to avoid two separate references for the two cites I will include pared-down quotes appropriate to both citations. The Iron Mountain article quote included numeric information on Mean Time Between Failures—which was not "sufficiently captured by the text"—for the two types of media, but I will pare out those numbers despite the loss to readers.
If JohnInDC still objects to the pared-down quotations, I will file a request for a Wikipedia:Third opinion—as I did with my dispute with scope_creep as related in my "13:41, 13 October 2017 (UTC)" comment in this section of the Talk page for another article. This dispute is just between him and me, so there won't be any "third-person involved" copout this time. DovidBenAvraham (talk) 01:04, 1 July 2018 (UTC)
First, from Wikipedia:Quotations#Overuse: "Longer quotations may be hidden in the reference as a WP:FOOTNOTE to facilitate verification by other editors without sacrificing readability. Verification is necessary when a topic is controversial." These are not controversial claims; the redundant quotes are unnecessary to that (or any other) end. Second. Other than the Western Digital white paper, the linked references are short - in one instance a single screen. The redundant quotes are not necessary to help the reader locate the relevant passage. Third, in three instances the quotes add nothing of substance to the article text. They are entirely superfluous. The quotes for the three short articles are cluttery and add no value. I intend to remove them again. I'll look harder at the Iron Mountain reference for additional information but may remove that too. Fourth. Please avoid personal attacks ("because of an inability to read English at the appropriate level"); and assume good faith ("so there won't be any 'third-person involved' copout this time"). And finally, feel free to seek a third opinion. Thanks. JohnInDC (talk) 01:21, 1 July 2018 (UTC)
First, I apologize for saying "an inability to read English at the appropriate level"; what I should have said is "an unwillingness to take the time to read English at the appropriate level and to think about what is being read". I think that is factual rather than a personal attack, because that is IMHO what JohnInDC has exhibited at the beginning of his "01:21, 1 July 2018 (UTC)" comment. The claims will be controversial for most readers, because they go against persistent folklore. And folklore can be a powerful influence on many people, even when it is based on unfounded speculation or outright fraud. That's why IMHO readers should be able to see the quotes with one mouse-over, rather than two mouse-overs per JohnInDC's suggestion for hiding them as WP:REFERENCEs within the footnotes.
In fact the "drop [your hard drive] on a hard floor as you pull it out of the safety deposit box, and like that, you’re off to the recovery service" was probably still true for some brands as late as 2007; that's why I put it in the quote for the PC World reference for the statement that was already in the article as of 2011. But I think it was only folklore by the early 2016 date of publication, because by then all the other manufacturers of portable drives had licensed the HGST-patented "fault-tolerant retract system"—included in the quote for the Western Digital-published whitepaper. Did the other manufacturers announce that they had licensed the patent? No, they just quietly changed their warranties to cover damage from a drop that doesn't exceed 350Gs of deceleration—which is the deceleration you get from dropping a portable drive from chest height onto a hard floor. That's why I had to include the quote in the Toshiba reference.
That Toshiba reference, BTW, is the only one of the three that is a single page. I put in the quote because its applicable sentence fragment would otherwise get lost in what designers would call a "busy" article. The Forbes article is 4 screen pages in Firefox on my 2010 Apple monitor, which has a vertical resolution of 1440 pixels. The PC World article is 9 screen pages.
The sentence in "Magnetic tape" that cites the Forbes reference is as close as I could come to stating that LTO is the only "super" format that is still under development. The last two sentences in the WP article that covers the T1000 format say that as of 2016 Oracle has stopped manufacturing tape drives, but don't give any reference—which if it existed I could use. IBM was a co-developer of the LTO format, and now seems to be concentrating only on selling 3592 cartridges to customers who already have tape drives that are compatible with them. IMHO the quote in the Forbes reference is needed to dispel the folklore that "There have been a variety of formats ...", which used to state "There are a variety..." before I changed it.
IMHO anyone reading the subsection should be made aware of the number-of-years difference between the long-term stability of magnetic tape and hard disk storage, which is alluded to in the Forbes and PC World quotes JohnInDC has removed. DovidBenAvraham (talk) 11:16, 1 July 2018 (UTC)
Looking further at the suggestion in JohnInDC's "01:21, 1 July 2018 (UTC)" comment, I see that he is suggesting using the procedure described in Help:Footnotes#Footnotes: embedding references. I see two problems with that: (1) I'd have to reverse the paring of each quote so that it would make sense at the beginning of a footnote. This would make the quote more prominent than the reference I had put it into, which IMHO is the exact opposite of what both JohnInDC and I want. (2) I'd have to put a full "cite web", or its full equivalent—remember that this is supposed to produce a full reference, inside the outer "refn" (sorry, I don't have time to figure out how to describe this in the proper WP syntax documentation way). Frankly, I don't think that is possible. In short, IMHO JohnInDC didn't think out the implications of what he wrote in the first paragraph of his "01:21, 1 July 2018 (UTC)" comment. DovidBenAvraham (talk) 18:47, 1 July 2018 (UTC)

Third opinion

barkeep49 (talk · contribs) wants to offer a third opinion. To assist with the process, editors are requested to summarize the dispute in a short sentence below. Please try to keep statements brief. I have read the above but there's a lot there.

Viewpoint by DovidBenAvraham

I started editing the ["Storage media" section] of the article on 10 April, after I found a couple of statements there that seemed to be un-referenced "community folklore" from 2007. The first statement was "There are many [magnetic tape] formats, many of which are proprietary or specific to certain markets like mainframes ....". The second statement was "The main disadvantages of hard disk backups are that they are easily damaged, especially while being transported (e.g., for off-site backups), and that their stability over periods of years is a relative unknown."

These statements are no longer true, so I have carefully revised them. JohnInDC has been pressing me for nearly a year to adopt The Encyclopedic Way, which is to wait for a Reliable Source to write a neatly-organized review-type article and then reference that. But no RS has written such articles (I've searched), because articles saying "Tape drives for 'super' formats other than LTO are no longer being manufactured" and "Hard disk drives being transported are no longer so easily damaged, because manufacturers have put ramp loading and accelerometer-equivalent technology into portable drives" unfortunately won't "sell papers". Therefore I have had to use references for my revised statements that contain sentence fragments of fact within either 2 long articles or one "busy" one-page marketing article.

My revised statements will be controversial for most readers, because they go against persistent folklore. (I have even put in one 2016 reference, IIRC suggested by JohnInDC, for a statement that I left in expressing the old hard disk folklore.) That's why IMHO readers should be able to see pared-down quotes with one mouse-over, rather than two mouse-overs—per JohnInDC's suggestion for hiding them as WP:REFERENCEs within the footnotes. The readers can then use the pared-down quotes as browser search arguments to find the nuggets of fact within the references. DovidBenAvraham (talk) 01:36, 2 July 2018 (UTC)

Viewpoint by JohnInDC

Speaking first - DBA may have a different view but I think the issue is simply whether certain direct quotations included in reference footnotes are helpful and / or necessary; or instead superfluous, confusing and cluttery. IMHO they add nothing to what's already in the text, and are not necessary to assist the reader in finding material in what are quite short articles. It's belt-and-suspenders, and as a matter of style and clarity the article is (in a small way) better without them. Here's a diff. There is indeed a lot of Talk above, but the pertinent discussion begins, I think, with my entry about 12 paragraphs up that begins, "There are too many quotes in the references generally..." JohnInDC (talk) 20:17, 1 July 2018 (UTC)

Third opinion by barkeep49
3O Response: The quotes do not add value to the reader Best, Barkeep49 (talk) 03:35, 2 July 2018 (UTC)

Thanks JohnInDC and DovidBenAvraham. There doesn't seem to be any policy or guidelines, at least that I can find, which prohibits the use of the quotes in the references. As such this is purely a content matter. I have seen this format used in FAs & GAs but generally for sources which are either long or not accessible online. None of the sources in this section are particularly inaccessible and simple searching (or eve just scanning) in all of them readily turned up the paraphrased information. As currently constructed the two quotes are not useful. The first quote is very long; and the content that is most applicable is "Ramp load/unload technology greatly minimizes the effects of shock damage by safeguarding against head/disk contacts.", which doesn't actually say more than the paraphrase. The second quote actively deleted the years difference between HDD (7) and tape (46) rendering it useless as a myth buster. It is my (third) opinion that the reader doesn't really benefit from either quote's inclusion any more than if they were to click through. I will make two observations:

  1. The citations in this section are generally below a standard that I would expect when doing a GA review and fall short of truly honoring WP:RS. There are issues with reliability (Forbes contributor) and secondary (Toshiba, WD white paper).
  2. DovidBenAvraham, your apology for the uncivil remarks was not an apology and the revised statement was still uncivil. It looks like you two will have to keep collaborating and I'd encourage you to focus on content, not your fellow editor.

I will keep this page watch listed for a few days if there are any questions I can answer or clarifications I can make. Best, Barkeep49 (talk) 03:35, 2 July 2018 (UTC)

Thanks for your time and the help. JohnInDC (talk) 10:45, 2 July 2018 (UTC)
Thanks, Barkeep49, for taking the time to show me that "simple searching (or eve [sic] just scanning) in all of them readily turned up the paraphrased information." I guess I had a pessimistic idea of the search abilities of an average reader unacquainted with an article's topic. I would, however, like to comment on your two observations:
  1. A click on the Full Bio information [you may have to click on Tom Coughlin's photo first] in the Forbes article shows that Tom Coughlin has credentials appropriate to his being a Forbes contributor; perhaps you have confused him with the sports figure of the same name. Here's a publication in an industry-society journal by him. Given the business Iron Mountain is in, IMHO that is the reference whose reliability as to the lifetime of tape (a bulky medium which that company stores) vs. hard disk (a compact medium which can be stored in a bank safe deposit box) you should question. As to the lack of references to secondary sources, I have already explained the regrettable reason for that in the second paragraph of my Viewpoint.
  2. As to the incivility in the first paragraph of my "11:16, 1 July 2018 (UTC)" comment, my "18:47, 1 July 2018 (UTC)" comment explains why I feel that the suggestion in the first paragraph of JohnInDC's "01:21, 1 July 2018 (UTC)" comment was not up to his usual high standard—although I realize now that his suggestion was intended to be helpful. DovidBenAvraham (talk) 09:18, 3 July 2018 (UTC)
@DovidBenAvraham: Thanks for your reply. I will admit that I saw Forbes Contributor and that was the extent of my checking so I didn't even pay attention to his name (and am enough of a sports fan to know on sight what football Tom Coughlin looks like). I agree upon a fuller look that my issues with that article as RS are unfounded. And I'm also glad to know that your comments about John were momentary frustration rather than a tipping point into something else. Best, Barkeep49 (talk) 14:46, 3 July 2018 (UTC)

I just noticed the Western Digital whitepaper says the HGST fault-tolerant retract system is patented and used in any drive incorporating load/unload tech. I've added that sentence to the quote, because it means all portable drives park their heads on a ramp even if powered-off inadvertently. The very important implication is that if all HGST portable 2.5-inch drives can withstand 1,000 Gs non-operating shock, then so can those from any leading manufacturer. If JohnInDC wants to scrap the entire quote, Barkeep49's 3O says he's justified in doing so; however if the quote stays in, IMHO so should that additional sentence.

The key question I'm still trying to find the answer to is whether, if you drop a non-"rugged" portable backup HDD from about waist height onto a hard floor, its data will still be OK. I've learned from a Tom's Hardware forums post that the problem can be approached as follows: If you drop a portable HDD from a 1-meter height into a 1-meter-deep lemon meringue pie (excessive but yummy) and the drive stops falling just above the piecrust, then the HDD mechanism has sustained 1Gs deceleration force. If you drop a portable HDD from a 1-meter height into a 1-centimeter-deep bowl of Jello (pretty chintzy) and the drive stops falling just above the bowl's bottom, then the HDD mechanism has sustained 100Gs deceleration force. If you drop a portable HDD from a 1-meter height onto a hard floor and the drive casing flexes 1 millimeter, then the HDD mechanism has sustained 1000Gs deceleration force. As a non-engineer, I'm not sure whether a non-"rugged" portable HDD's mechanism flexes as much as a millimeter when dropped. (Yesterday I spoke to a tech support person at OWC, who suggested that I ask the question of Randall Munroe—the author of the webcomic XKCD; I see he has a blog What If?.)
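The drop arithmetic in the comment above can be sketched in a few lines. This is only a minimal sketch, assuming the drive falls from rest, is then stopped at a constant deceleration over the stated distance, and that air resistance is negligible; under those assumptions the average deceleration in multiples of g is simply the drop height divided by the stopping distance. The function names are illustrative and do not come from any of the cited sources.

```python
# Idealized drop model: free fall from rest, then constant deceleration
# over a short stopping distance. All names here are illustrative.

G = 9.81  # standard gravity in m/s^2 (assumed value)


def impact_speed(drop_height_m):
    """Speed in m/s on reaching the surface, from v^2 = 2*g*h."""
    return (2 * G * drop_height_m) ** 0.5


def deceleration_in_g(drop_height_m, stopping_distance_m):
    """Average deceleration as a multiple of g.

    Combining v^2 = 2*g*h with a = v^2 / (2*d) gives a/g = h/d.
    """
    if stopping_distance_m <= 0:
        raise ValueError("stopping distance must be positive")
    return drop_height_m / stopping_distance_m


def stopping_distance_for(drop_height_m, limit_g):
    """Stopping distance in metres needed to stay at or below limit_g."""
    if limit_g <= 0:
        raise ValueError("g limit must be positive")
    return drop_height_m / limit_g


if __name__ == "__main__":
    cases = [
        ("1 m of pie", 1.0),
        ("1 cm of Jello", 0.01),
        ("1 mm of casing flex", 0.001),
    ]
    for label, distance_m in cases:
        print(label, deceleration_in_g(1.0, distance_m))
```

Under the same assumptions, `stopping_distance_for(1.0, 350)` works out to a little under 3 mm of give for a 1-meter drop. Real impacts are not constant-deceleration events, so peak forces would be higher than this average.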

If the drive casing doesn't flex that much, then I should change the next-to-last sentence in the "Hard disk" to emphasize using "rugged" HDDs. Those essentially wrap the portable HDD in the equivalent of at least 1 centimeter of Jello by adding bumpers. The 2017 version of The Wirecutter article, which I used to use as a reference, contained a paragraph of evidence that 2 models of "rugged" portable HDD survived at least 16 1-meter drops during testing. That paragraph was deleted—apparently for reasons of brevity—when the Wirecutter article was revised in place in June 2018, so I deleted the reference and this paragraph's mention of "rugged" HDDs. Maybe I could use the Wayback Machine for a reference, and put the mention of "rugged" portable HDDs back in. DovidBenAvraham (talk) 07:00, 4 July 2018 (UTC)

It's a quote from a WD white paper extolling the unique ("patented"!) virtues of WD drives - and reads like someone trying to sell WD drives. Not that you are - but Wikipedia shouldn't sound that way. I do not agree with the addition, nor do I think that this quote adds anything to the text, which (paraphrased) simply says that the adoption of ramp technology seems to have made drives less vulnerable. That's precisely what the quotation says - again, redundant and superfluous. It's like the others and should be removed altogether. JohnInDC (talk) 10:05, 4 July 2018 (UTC)
JohnInDC, if you think "patented" means "unique", I think you, along with many other Americans, need to acquaint yourself with the concept of patent licensing. The sentence I added to the quote said "In February, 2000, HGST was awarded the patent to this invention (US 6,025,968), which is used in all HGST drives and any hard drive incorporating load/unload technology". Since the Toshiba reference says they are using ramp loading, and since a patent awarded in February 2000 is still in force until February 2020, either Toshiba is licensing the patent US 6,025,968 from Western Digital or they are violating U. S. patent law. Leaving in the quotes for both the Western Digital and Toshiba references would certainly make it easier for a reader to conclude that a powered-off portable HDD from any major manufacturer automatically has its heads safely parked on the ramp, which is the specific less-vulnerable message I want to convey. While the original quote did tend to extol the virtues of WD drives, I had pared it down before my latest addition to eliminate "market speak". The added sentence doesn't need to be pared down, since there are a lot of WP articles that say technology X was invented by company Y, which no other WP editor (except possibly scope_creep) thinks is "market speak". In view of Barkeep49's 3O I'm not going to get into a revert war with you, but I think pared-down versions of the two quotes would give a reader something that would otherwise require parallel scanning of the two references. DovidBenAvraham (talk) 13:49, 4 July 2018 (UTC)
It's WD tooting its own R&D chops, repeated in the FN of a Wikipedia article, and is troublesome whether it indirectly promotes a WD product or a third party product featuring WD's licensed (patented!) technology. And the quote - like all the others - merely expands on the existing paraphrase in the article. And it has a page number associated with it so people can find it easily. And again, as I've observed many times, we are not here to provide buying advice - explicitly or implicitly - to people who may be thinking about buying hard drives and who may find it useful to know what particular drive-head technology may be included in one or another product of major drive manufacturers. Wikipedia isn't here for prospective purchasers, or for that purpose. Finally, I'm tired of your condescension. You may have concluded that I'm dimwitted and poorly educated, but please try to keep it out of your commentary. Thanks. JohnInDC (talk) 14:43, 4 July 2018 (UTC)
I've followed JohnInDC's approach in converting the Western Digital quote into a "page=" parameter with a brief annotation in parentheses, and done the same for the Forbes and PCWorld and Toshiba former and Iron Mountain current quotes. To do so I've used "cite web"'s ability, which may not be well known, of treating a section name with no embedded spaces preceded by a hash mark as a page "number". This guides a reader to the proper general place in a multi-page reference article, and brief annotations—there can be multiple ones separated by commas—wrapped in a pair of parentheses following the "page number" provide finer-grained detail as to what the reader should look at. Neither the "page numbers" nor the annotations need pollute the reader's mind with "buying advice" or "market speak". I think this is a fine compromise in the spirit of Barkeep49's 3O.
I haven't concluded that JohnInDC is dimwitted or poorly educated; quite the contrary. I'm rather envious of the style and speed with which he writes English. I think the problem is that the two of us are fitted for different roles in the Wikipedia universe. I'm rather pedantic, and very concerned to get the technical details right in the few articles I edit. JohnInDC, OTOH, seems to switch rapidly between many articles—making sure that the editors of those articles don't commit common WP or stylistic sins in their pedantry. My impression is that JohnInDC sometimes is too rushed to think things through thoroughly, as in his "01:21, 1 July 2018" suggestion. After further thought he frequently more than redeems himself, as in his latest editing of the Western Digital reference. I'll try to control my temporary annoyance and not sound condescending in the future, as we continue to work together. DovidBenAvraham (talk) 20:13, 4 July 2018 (UTC)
Thanks. JohnInDC (talk) 21:36, 4 July 2018 (UTC)
An experiment reveals that "cite web"'s ability to treat a section name preceded by a hash mark as a page "number" does not require that the section name be space-free, as I had previously assumed! This means that a reader wanting to check the details of a reference that may seem controversial need only open a second copy of the article, and then copy-paste the section name—as seen in the reference contained in the second copy—into the search field of a browser viewing the same reference in the first copy of the article. Since the section name can contain spaces, the reader is not required to either insert spaces or convert underscores into spaces after copy-pasting the section name into the browser search field. This is a really stunning WP feature; kudos to whoever put the feature into "cite web"! Obviously I've now put embedded spaces back into the reference section names of cites I inserted into this section of the article. DovidBenAvraham (talk) 07:03, 5 July 2018 (UTC)
Upon looking at Template:Cite web#In-source locations and experimenting, I found out the hash mark isn't necessary. The truth is that the code interpreting a "cite web", after it has been inserted using the GUI template, does so as if the parameter is "at=" instead of "page=" or "pages=". However if the parameter is actually "page=" or "pages=", the interpreting code also displays a "p." or "pp." as appropriate—which it isn't if the ref is to an un-numbered Web page. Unfortunately the "cite web" GUI template doesn't have a field allotted for "at=", so you have to make the change to "at=" yourself after the template is inserted. I replaced every "page=" with "at=" in refs for the "Magnetic tape" and "Hard disk" prgfs., except for the Western Digital ref which actually has page numbering. I also inserted "sec." or "para." where appropriate in the refs, since the WP article section linked to in the first sentence of this comment wants us to use those clarifying abbreviations. DovidBenAvraham (talk) 20:42, 6 July 2018 (UTC)
My friend started a thread on the Ars Technica Other Hardware forum to ask about how much the casing flexes when a non-"rugged" portable HDD is dropped from 1 meter onto a hard floor. Someone found a 2016 Product Manual for a Seagate Samsung laptop internal HDD; it said "The nonoperating shock level that the drive can experience without incurring physical damage or degradation in performance when subsequently put into operation is 650 Gs based on a nonrepetitive half-sine shock pulse of 2 ms duration." The same person also found an "Acceleration levels of dropped objects" PDF with a table that said the estimated peak acceleration level for a drop height of 36 in. and a pulse width of 2 ms. duration is only 340 Gs. That is well within the non-operating shock level of a Seagate Samsung drive. Even with a pulse width of 1 ms. duration the estimated peak acceleration level is 680 Gs, which is slightly more than the Samsung maximum but considerably less than the HGST maximum shown in the reference I cited in the article's "Hard disk" paragraph. I think the two references taken together show that, even without considering flexing of a non-"rugged" case, it is correct to say that "the transport vulnerability has been reduced". Would that be considered WP:OR and, even if it wouldn't be, would adding those two references really improve on the two references that are already there? DovidBenAvraham (talk) 01:50, 9 July 2018 (UTC)
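For anyone checking the figures quoted above, here is a minimal sketch of one way such a drop table could be derived. It assumes (this is not stated in either source) a free-fall drop stopped by a single half-sine pulse with no rebound, so that the impact velocity equals the area under the pulse. The function name and constants are illustrative only.

```python
import math

G = 9.80665    # standard gravity in m/s^2
INCH = 0.0254  # metres per inch

def peak_half_sine_g(drop_height_in, pulse_ms):
    """Peak acceleration, in g, of a half-sine pulse that stops a drop.

    Assumption: no rebound, so the velocity change equals the impact
    velocity sqrt(2*g*h). The area under a half-sine pulse of peak A
    and duration T is 2*A*T/pi, which gives A = pi*v / (2*T).
    """
    impact_velocity = math.sqrt(2 * G * drop_height_in * INCH)
    pulse_s = pulse_ms / 1000
    peak = math.pi * impact_velocity / (2 * pulse_s)
    return peak / G

QUOTED_LIMIT_G = 650  # non-operating limit quoted from the product manual

for pulse_ms in (2, 1):
    peak = peak_half_sine_g(36, pulse_ms)
    verdict = "within" if peak <= QUOTED_LIMIT_G else "above"
    print(f"36 in drop, {pulse_ms} ms pulse: about {peak:.0f} g ({verdict} the quoted limit)")
```

Under those assumptions the results land close to the 340 Gs and 680 Gs quoted from the table, which suggests (but does not prove) that the table was built the same way.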
That sounds like textbook Synthesis - that is to say, combining two sources to support an implied conclusion that neither of them states. They can't be used for any unstated conclusion, and I don't see how they'd improve the article instead of confusing the reader who - at best - is being invited to perform the Synthesis themselves. JohnInDC (talk) 22:51, 9 July 2018 (UTC)
Thanks, JohnInDC, I was afraid you'd say that—which is why I asked. On the Ars Technica thread my friend started, another poster replied "Manufacturers don't publicize this data [maximum non-operating shock specs] because of litigation. You claim a drive can survive a 1 foot drop and they'll start getting RMA claims that the drives broke because they didn't survive a 1 foot drop." This concept, which was too worldly to occur to me or my friend, explains why we couldn't find references containing the maximum safe drop distance for a non-"rugged" 2.5-inch portable HDD. So I'll just have to be satisfied with the personal certainty that the sentence I added to the "Hard disk" paragraph in the article is correct. DovidBenAvraham (talk) 22:13, 10 July 2018 (UTC)
I don't after all just have to be satisfied with the personal certainty, because it turns out that in 2010 EMC Iomega was almost as unworldly as my friend and I are. They published this English-language article with a French-language URL. However their unworldliness had practical limits; see "*NOTE" at the very bottom of the article. Nevertheless, since the "What is Drop Shock Technology?" sub-section gives baseline shock tolerance specifications for portable HDD Drop Shock Technology—including actual drop test numbers for non-operating shock, I have added it as a ref for the "... has been reduced" sentence in the "Hard disk" paragraph of the WP article. BTW I found the EMC Iomega article by first finding a 2012 The Register review of portable HDDs that mentioned the term "Drop Shock Technology". DovidBenAvraham (talk) 00:35, 13 July 2018 (UTC)
The "What is Drop Guard Technology?" sub-section of the EMC Iomega ref nails an actual average shock tolerance drop test distance for the entire industry! It says that Drop Guard Technology, which adds special internal cushioning, puts its shock tolerance specification "40% above the industry average for portable hard drives". That means by simple calculation, which can't be WP:Synthesis since it is entirely based on a single reference, 51 inches ≈ 1.4 * 36 inches—which must therefore be the industry average. Obviously I've added the quote, including the calculation inside brackets, to the ref. And, alluding back to the Third Opinion, I confidently maintain that the reader does really benefit from this quote's inclusion—particularly because of the bracketed calculation. I invite JohnInDC or Barkeep49 to refute that; "le jour de gloire est arrivé". DovidBenAvraham (talk) 08:01, 14 July 2018 (UTC)
The source article is about a screen page long, the quote is easily found, the ref is one of several examples in the text of a general statement - this detail, with editor-contributed material, isn't necessary for any reason. JohnInDC (talk) 15:23, 14 July 2018 (UTC)
The Iomega .PDF article is in fact 3 screen pages long; its writer put page numbers at the bottom of each page, so its Firefox pages aren't just too tall for my 27-inch monitor. Having belatedly noticed the page numbers myself, I've changed the ref's "at=" parameter back to a "page=" parameter—and put in (in parentheses after the page number) just enough sub-section information to make the quote "easily found". The reader will now have to remember enough elementary algebra to formulate and solve "(100% + 40%) x = 1.40x = 51 inches; find x"; the deleted "editor-contributed material" merely did that algebra on behalf of readers whose math skills aren't still as good as those of JohnInDC.
I think JohnInDC still fails to appreciate that the purpose of the second-to-last sentence of the WP article "hard disk" paragraph is to answer a likely reader's question: "Does the 2016 PCWorld ref in the preceding sentence mean that I have to buy an expensive won't-fit-in-my-pocket 'rugged' HDD to use for my off-site backup?" IMHO the hard-to-find (see my "22:13, 10 July 2018 (UTC)" comment) industry specification calculable from the Iomega article says "Not as of 2010, so long as you take care not to drop your non-'rugged' portable HDD onto the hard bank safety deposit room floor as you transfer it between the 30-inch-high table your box is sitting on and your 36-inch-high pocket." The Iomega article says the industry-standard drop test is done onto a floor with industrial carpeting, which is no softer than what you are likely to drop your portable HDD onto almost any place else. If he thinks about it, he will also realize that it would have been in Iomega's interest to understate the industry average, so if the answer to the algebra problem is "x = 36 inches" we can believe it even though Iomega is a manufacturer. DovidBenAvraham (talk) 17:33, 14 July 2018 (UTC)
In keeping with my tradition of occasionally being heavy-handed on this Talk page, I will now explain why I considered the "editor-contributed material"—a simple bracketed calculation result in a quote of the industry average shock tolerance for portable HDDs—to be necessary. If you actually listen to statements made by public figures, especially by politicians such as Old Golden Boy, you will notice that they nowadays say "Y is six times more than X" when they really mean "Y is six times as much as X". They also say "Y is one-third less than X" when they really mean "Y is two-thirds as much as X". That's because these public figures realize that a good percentage of American adults don't understand 8th-grade math. (That's a non-Old-Golden-Boy explanation of why manufacturing jobs have left the U.S. for countries such as China. Remember that an important reason why Japanese products gained a superior reputation for quality from the 1960s onward is that Japanese managers were able to teach statistical quality control techniques developed by the American W. Edwards Deming to their high-school-graduate employees.) A reader who doesn't understand 8th-grade math will read "... at a height of 51 inches onto industrial carpeting. This shock tolerance specification is 40% above the industry average for portable hard drives" in the Toshiba ref, and will then formulate the problem as "x = (100% - 40%) * 51 inches = 0.60 * 51 inches; find x". That formulation gives "x = the industry average for portable HDDs = 31 inches", which is the wrong answer (it should be 36 inches) because the problem formulation was wrong (per the first paragraph of my "17:33, 14 July 2018 (UTC)" comment)—a result of the reader's not really understanding "40% above". And that's why I considered the simple bracketed calculation result in a quote of the industry average shock tolerance for portable HDDs to be necessary for many American readers.
The difference between 31 inches and 36 inches is the difference between standard desk/table height and—for me—pocket height. DovidBenAvraham (talk) 06:57, 15 July 2018 (UTC)
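The two formulations contrasted in the comment above can be put side by side in a short sketch. The 51-inch figure and the "40% above" wording come from the quoted reference; the function names are purely illustrative.

```python
def baseline_from_percent_above(value, percent_above):
    """Recover the baseline when `value` is `percent_above` percent above it.

    "51 inches is 40% above x" means 1.40 * x = 51, so x = 51 / 1.40.
    """
    return value / (1 + percent_above / 100)

def mistaken_baseline(value, percent_above):
    """The misreading described above: taking 40% off the higher figure."""
    return value * (1 - percent_above / 100)

correct = baseline_from_percent_above(51, 40)
wrong = mistaken_baseline(51, 40)

print(f"51 / 1.40 = {correct:.1f} inches")  # rounds to 36 inches
print(f"0.60 * 51 = {wrong:.1f} inches")    # rounds to 31 inches
```

Dividing by 1.40 and multiplying by 0.60 are not the same operation, which is why the two readings differ by about five inches.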
My compromise solution was to put the calculation of the Iomega-article-asserted 2010 industry average shock tolerance specification for portable hard drives into a footnote at the end of the sentence where it is cited. The footnote also says that the drop testing—onto industrial carpeting—described in that article is less stringent than that—onto a hard floor—posited in the 2016 PCWorld article cited in the preceding sentence. DovidBenAvraham (talk) 05:42, 17 July 2018 (UTC)
Please stop adding your own calculations and conclusions to these articles. It's WP:OR. Anyone who is interested at the level of detail you're providing is fully capable of reading every one of these sources and making their own purchase decisions based on their own math. Not to mention that Wikipedia articles aren't written in order to facilitate buying decisions of its readers. Thanks. JohnInDC (talk) 12:26, 17 July 2018 (UTC)
First, the calculation in the footnote that JohnInDC has deleted was not WP:OR. The second paragraph of Wikipedia:Calculations#Routine calculations says "Routine calculations do not count as original research. Basic arithmetic, such as adding numbers, converting units, or calculating a person's age, is allowed provided there is consensus among editors that the calculation is an obvious, correct, and meaningful reflection of the sources." I could easily recast the equation and solution in the footnote as "36 inches (51 inches ÷ 140%)"; that is thoroughly in keeping with the "Appended" method in the Wikipedia:Calculations#Routine calculations 2 section of that same Wikipedia:Calculations#Routine calculations article, and is a correct and meaningful—if non-obvious to the mathematically-challenged—reflection of the Iomega reference's text.
Second, before saying I want to "facilitate buying decisions of [the article's] readers", I suggest that JohnInDC take several deep breaths, and then really read the entire "Storage media" section of the article, while asking himself the question "Why does each of these paragraphs contain so many sentences?" He will quickly realize that the third or fourth and following sentences in each paragraph are intended to educate a reader who needs to decide which type of storage media—not which particular brand—to "buy into". The footnote that he deleted was simply intended to present the numerical industry-average non-operating shock tolerance specification for portable HDDs, and to contrast the test method for that specification with the more stringent test method proposed in the PCWorld article. I only mentioned the name "What is Drop Guard Technology?" to enable a reader to find where in the Iomega reference the numerical specification is spelled out in words. I wasn't trying to get the reader to buy a portable HDD with Drop Guard Technology; in fact AFAICT Iomega no longer manufactures HDDs with that technology (which was an attempt to produce a "rugged" portable HDD that wasn't as fat as the ones with bumpers—of which Drop Guard Xtreme was a no-longer-manufactured example). DovidBenAvraham (talk) 04:09, 18 July 2018 (UTC)
The articles aren't for any kind of buying advice. They are to inform readers about the general subject matter, using material drawn from reliable sources that have written about the matter. Your addition - here, for convenience:
The information in the "What is Drop Guard Technology?" sub-section of the Iomega article cited in this sentence is sufficient to formulate the algebra problem "(100% + 40%) x = 1.40x = 51 inches; find x". Its solution, "x = 36 inches", is thus the asserted 2010 industry average shock tolerance specification for portable hard drives. The sub-section says those specifications are for drop testing onto industrial carpeting, which is less stringent than the drop onto a hard floor posited in the "External hard drives" section of the 2016 PCWorld article cited in the preceding sentence.
- isn't some simple math calculation but a collection of inferences and personal observations designed, as you yourself note, to assist in purchasing decisions by readers. The fact that there's an embedded basic calculation doesn't salvage it. Further, the footnote is incoherent to anyone who doesn't delve into the reference, and, anyone who does delve into it will find this information right at hand. I repeat. This article is about backups, and backup technologies generally, and isn't here as a handy "Best Technologies for Backups!" site for interested systems administrators. Please stop trying to make it that. And, again, please stop the cutesy little personal asides like "take a deep breath", trying to paint me as hysterical or unhinged. They annoy me, and don't impress or persuade any third party unfortunate enough to be following these exchanges. JohnInDC (talk) 13:18, 18 July 2018 (UTC)
Here's a more-coherent rewrite of the footnote that should meet JohnInDC's WP-rules-based standards:
The third manufacturer reference cited in this sentence states that the 2010 industry average transport shock tolerance specification is drop-testing at a height of 36 inches (51 inches ÷ 140%) onto industrial carpeting. That contrasts with "You can do everything right with your drive, but drop it on a hard floor [emphasis added] as you pull it out of the safety deposit box, and like that, you’re off to the recovery service", which is the potential transport vulnerability in the PCWorld reference cited by the preceding sentence.
The first sentence in the rewrite is a routine calculation entirely from the text of the reference, which—as I have pointed out in the first paragraph of my "04:09, 18 July 2018 (UTC)" comment—does not make it an inference according to WP rules. The second sentence in the rewrite highlights an apparent Conflict between sources, but I have not made a personal observation—merely complied with "If equally reliable sources disagree, present all of the information" (per the second bulleted paragraph in the quoted section of the WP rule) accompanied by pointing out that the apparent disagreement results from two different definitions of drop testing. Note that I have made the first sentence of the rewrite klunkier by avoiding mention of the name Iomega as the author of the third reference; that should make scope_creep happy.
As for "Wikipedia articles aren't written in order to facilitate buying decisions of its readers", IMHO that sentence from JohnInDC's "12:26, 17 July 2018 (UTC)" comment is fooling no one but himself as far as this article is concerned. He should look at the last sentence—and sometimes the last two sentences—of the "Magnetic tape" and "Optical storage" and "Solid state storage" and "Remote backup service" paragraphs of this "Storage media" section of the article. Every one of those sentences is stating an advantage or a drawback of its particular type of storage media, and I didn't write those sentences. I split the former next-to-last sentence of the "Hard disk" paragraph because its existing "buying advice" was at least partially obsolete, but my new next-to-last sentence—including the rewritten footnote—just presents the same kind of information that the equivalent sentences in the other paragraphs also present. DovidBenAvraham (talk) 01:23, 22 July 2018 (UTC)
The text already makes the point clearly and concisely, in essence: "Hard drives are thought to be fragile, but newer technologies seem to ameliorate that problem." With supporting refs. We don't need an interpretive footnote explaining mathematically why this is so, and we don't need an editor's narrative interpretation of material not expressly stated in the sources, to assist perfectly intelligent readers to understand the implications of one of the source's tests. Please just let this go. JohnInDC (talk) 02:15, 22 July 2018 (UTC)
JohnInDC's effort to "remove stray fact", evidently in response to the last paragraph of my "01:23, 22 July 2018 (UTC)" comment, went only as far as deleting the last sentence in "Floppy disk and its derivatives"—the one I had added expressly to contain the reference justifying "... kept them ['superfloppy' and a related 'non-floppy' devices] useful for backing up far longer" in the preceding sentence. I'm shocked that JohnInDC would delete a ref for a relevant fact that is still in the article, but I re-inserted the ref—now suitably page-numbered for its several cites—into the preceding sentence. DovidBenAvraham (talk) 00:22, 23 July 2018 (UTC)
The source only indirectly, and by inference, supports the assertion, but I'm fine with your having restored it. JohnInDC (talk) 00:33, 23 July 2018 (UTC)
Less than two years ago the head of Retrospect Technical Support posted, on Retrospect Inc.'s forums, "A large number of customers use Removable Disk Backup Sets and have existing removable disk backup sets. This backup set format is not limited to only the media types listed in the dialog box." AFAIK I can't use that post as a ref, but it gives me confidence in making the assertion. — Preceding unsigned comment added by DovidBenAvraham (talk • contribs)
Well, what that, and the reference, establish is that Retrospect still supports super floppies. That's about it. One developer's arguably anachronistic design decision doesn't say much of anything about the continuing utility of super floppies as a backup medium generally, or how long they've managed to hang on compared to floppies. Indeed, having just laid this out, I've revised the text to conform a little more closely to what this source tells us - namely, that one developer still supports the larger-capacity superfloppies. I've removed the assertions about their continuing utility, and for how long vis-a-vis standard floppies. JohnInDC (talk) 11:27, 23 July 2018 (UTC)
JohnInDC's edit of "Floppy disk and its derivatives" is an improvement. However his merging of two sentences about formats into one in "Magnetic tape" created a problem, because it changed the first clause of the combined sentence from past-perfect to present tense. I've changed it back to past-perfect tense because, as the reference in the next sentence makes clear, you now can't buy a new tape drive that supports any format other than LTO (unless you're an IBM mainframe user, and maybe not even then). I've added a link at the beginning of the merged sentence to a chronological list of tape formats, but (assuming the voice of Star Trek's Dr. Leonard "Bones" McCoy) "they're dead, John." DovidBenAvraham (talk) 21:09, 23 July 2018 (UTC)

Third opinion (reprise)

Response to third opinion request:
This issue was relisted. If it is ongoing, I suggest starting a thread at WP:DRN. Erpert blah, blah, blah... 12:24, 25 July 2018 (UTC)

I attempted to put in a DRN request, and got "Error: API returned error code "badtoken": Invalid CSRF token"—which I couldn't fix. The attempted request is as follows:

[ DovidBenAvraham (talk) 21:22, 26 July 2018 (UTC) has now deleted the text; I only "parked" the attempted DRN request here, because the "Error: ..." prevented me from inserting it using the DRN form where it was supposed to go. ]

DovidBenAvraham (talk) 14:18, 25 July 2018 (UTC)

I am of course biased in favor of my own position, but I suggest that you just let this go. The footnote is arcane & confusing, too detailed, borderline OR and Synthesis, and does not further illuminate the textual content. Not to mention that it is a very small issue in an article that (IMHO) already goes too far into the weeds. You've already got a Third Opinion that these footnotes are not helpful in general, and escalating this additional instance further up the chain is - again IMHO - a waste of everyone's time. I would also suggest, more broadly, that you venture out into the wider world of the encyclopedia and spend some time reading and trying to improve articles other than Retrospect (software)‎ and Backup, and interacting with editors other than me, in order that you might gain a better sense of how the project works and how disagreements are resolved. JohnInDC (talk) 15:09, 25 July 2018 (UTC)
I re-entered the DRN request with Safari, which eliminated the error message. Here's where JohnInDC should state his position. The Third Opinion was about my putting quotes into the references, which was an entirely different—and much bulkier—addition than a single footnote which only readers who choose to mouse over it will read. Unlike JohnInDC, my only aim is to edit WP articles whose subject I know something about. I'm not quarreling with his "sense of how the project works", only saying that IME it sometimes conflicts with what I think the readers are reasonably entitled to—without real OR and/or Synthesis. A few days ago I was greeted, when logging on to WP, with an announcement of some kind of online symposium IIRC about where Wikipedia should be going. Judging from remarks about Wikipedia made recently by my friends and acquaintances, I'd say those potential readers are not too happy with "how the project works" these days. DovidBenAvraham (talk) 19:07, 25 July 2018 (UTC)
DovidBenAvraham, the new thread is supposed to actually be opened at DRN, not here. Erpert blah, blah, blah... 20:06, 26 July 2018 (UTC)
He did, eventually. JohnInDC (talk) 21:07, 26 July 2018 (UTC)
I knew that when I wrote the text that is now deleted from my "14:18, 25 July 2018 (UTC)" comment. As stated at the top of that comment, I couldn't enter that text in the DRN form because doing so was giving me an error—so I "parked" it in this sub-section. I eventually figured out that switching Web browsers might eliminate the error; it did, so I've now deleted the text I "parked" here. DovidBenAvraham (talk) 21:22, 26 July 2018 (UTC)