I'm absolutely delighted by this - I'd long since given up hope of a mainstream vendor looking at a cached client approach rather than trying to insist on either thick or thin.
The issue of backing up user home directories from mobile devices is mentioned in the PDF, but I'd be very interested in hearing about approaches for dealing with syncing shared data stores.
With a home directory or private data store, basic sync works if it can be made transparent to the user, and/or integrated into the GUI (Off-line Files on MS Windows, Mac iDisk).
Writable shared data stores presumably require some form of reconciliation or version control for multiple off-line copies to be handled smoothly. I suppose that one solution might be to wrap access to a version control system into the standard GUI so that the functionality becomes accessible to office workers; another might be to use a database-backed system like Storage, which could replicate.
How are people dealing with this issue on their networks now? What do we think would be the "Just Works" solution in the context of Stateless Linux?
(I've seen the problem, but I don't know the answer...)
-- Stuart Ellis s.ellis@fastmail.co.uk
On Tue, 2004-09-14 at 13:20 +0100, Stuart Ellis wrote:
The issue of backing up user home directories from mobile devices is mentioned in the PDF, but I'd be very interested in hearing about approaches for dealing with syncing shared data stores.
With a home directory or private data store, basic sync works if it can be made transparent to the user, and/or integrated into the GUI (Off-line Files on MS Windows, Mac iDisk).
Writable shared data stores presumably require some form of reconciliation or version control for multiple off-line copies to be handled smoothly. I suppose that one solution might be to wrap access to a version control system into the standard GUI so that the functionality becomes accessible to office workers; another might be to use a database-backed system like Storage, which could replicate.
How are people dealing with this issue on their networks now? What do we think would be the "Just Works" solution in the context of Stateless Linux?
A sentiment around the office is that the "merge" concept is impossible to sanely present to users, and so instead we should stick to "master copy" and "backups".
Here is one UI idea which may be attributable to Bryan and Seth or may be something they wish to disown.
In any case, imagine you have two laptops, "Thinkpad" and "Inspiron", and a workstation "Optiplex". The Thinkpad is currently on the network, and the Inspiron is disconnected. You might see the following on the desktop of each system, in place of the current "hp's Home":
Thinkpad Desktop:
[icon] Thinkpad
[icon] Copy of Inspiron
[icon] Optiplex
Inspiron Desktop:
[icon] Inspiron
Optiplex Desktop:
[icon] Optiplex
[icon] Thinkpad
[icon] Copy of Inspiron
So the "Copy of Inspiron" is a read-only backup of your Inspiron homedir (kept on some network share). The other icons are the actual homedirs on those systems. If you imagine I now connect the Inspiron, my Optiplex desktop changes in real time:
[icon] Optiplex
[icon] Thinkpad
[icon] Inspiron
And if I disconnect the Thinkpad, then instantly Optiplex displays:
[icon] Optiplex
[icon] Copy of Thinkpad
[icon] Inspiron
In the above example, the disconnected Inspiron didn't have "Copy of Optiplex" or "Copy of Thinkpad" but optionally you could have that, if you were willing to use the disk space to keep a disconnected copy rather than having "Copy of Foo" be a read-only network file share.
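The labelling rule in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not a proposed implementation: the function name `desktop_icons`, the `online` set, and the `keeps_local_copies` flag are all made up for the sketch, and it assumes a "Copy of Foo" icon is available whenever the viewing machine is on the network or keeps its own disconnected copies.

```python
def desktop_icons(viewer, machines, online, keeps_local_copies=False):
    """Return the homedir icon labels shown on `viewer`'s desktop.

    viewer  -- name of the machine whose desktop we are drawing (hypothetical)
    machines -- all of the user's machines, in display order
    online  -- set of machine names currently on the network
    keeps_local_copies -- True if `viewer` stores disconnected copies locally
    """
    icons = [viewer]  # the viewer's own homedir is the one writable copy
    viewer_online = viewer in online
    for name in machines:
        if name == viewer:
            continue
        if viewer_online and name in online:
            # Both ends are connected: show the actual homedir.
            icons.append(name)
        elif viewer_online or keeps_local_copies:
            # Fall back to the read-only copy (network share or local disk).
            icons.append("Copy of " + name)
        # Otherwise nothing is reachable, so no icon is shown.
    return icons


machines = ["Thinkpad", "Inspiron", "Optiplex"]
online = {"Thinkpad", "Optiplex"}  # the Inspiron is disconnected
for viewer in machines:
    print(viewer, "Desktop:", desktop_icons(viewer, machines, online))
```

Calling it again with a different `online` set models plugging in the Inspiron or unplugging the Thinkpad; a real desktop would need to be told about those changes by whatever notices network state.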
This example is pretty contrived. Making up numbers, I'd bet conservatively that 90% or more of users have only one computer, 98% have at most two, and the remaining 2% have more than two. We might also want to think about a user who has a personal laptop and also uses shared, multiuser desktop systems. In that case perhaps there's a single NFS homedir used for all the desktop systems, plus a homedir for each laptop.
Anyway, the basic point is there's only one writable copy of each homedir (the copy that lives on the laptop itself), and multiple read-only copies.
For a file share used by multiple users, a simple approach is that they have a read only copy on their laptop, and it changes to the writable actual share while they are connected. Or just vanishes while disconnected, if appropriate.
Just having reliable homedir backup, plus the above UI, would be very useful and dramatically better than what most people use today. Maybe a complex merge solution would be even better, but those solutions never seem to catch on even though they've been implemented many times...
Havoc
On Sep 14, 2004, Havoc Pennington hp@redhat.com wrote:
In any case, imagine you have two laptops, "Thinkpad" and "Inspiron", and a workstation "Optiplex".
Something you might want to throw into the brain-storming: consider removable disks. I like to keep most of my data on big, fast removable hard drives, such that I can quickly switch from desktop to notebook by simply moving the disks around. Sure enough, I don't always want to carry the extra weight when I'm on the road, so I have local copies on the notebook of the home dir that's on the removable disks, and I do have a minimal home dir on the desktop as well, such that I can log into it even when the removable storage is connected to the notebook. How would you handle this scenario of having the master home dir in removable storage, with local (partial?) copies of the home dir in each box?
Just having reliable homedir backup, plus the above UI, would be very useful and dramatically better than what most people use today.
The only thing I don't quite like in this idea is the name. To me, backups are snapshots of filesystems that you archive for relatively long periods of time, often in multiple full and incremental copies. In this case, you're only keeping the most-recent snapshot of the filesystem, so I'd rather use a different name. I'm afraid no good suggestion occurs to me.
On Tue, 14 Sep 2004 19:41:52 -0400, Havoc Pennington hp@redhat.com wrote:
So the "Copy of Inspiron" is a read-only backup of your Inspiron homedir (kept on some network share). The other icons are the actual homedirs on those systems. If you imagine I now connect the Inspiron, my Optiplex desktop changes in real time:
Yes that's right.. I'm going to beat the dead horse. "Copy of whatever" sure sounds like a mirror to me... did you know that rdiff-backup can produce a full "up2date" mirror of the incrementally backed-up directory, plus a hidden directory tree to store the incremental differences? The UI you describe sounds very much like a slick way to connect to a read-only network shared rdiff-backup'd home directory to me.
Now of course.. I don't have the skill to make the UI slickness you describe work. But what if... I came up with a stupid hacky way to produce that sort of switching idea using a network mountable share of an rdiff-backup'd directory. A stupid little cronjob to detect if the real read-write mountpoint of interest unmounts, and then mount the read-only rdiff-backup mountpoint from elsewhere on the network. And I created a dumb...very very very dumb automated rdiff-backup script that would attempt to keep the backup synced with the real mountpoint when it was active, on something like a reasonable sync timescale of once an hour or less. Would that interest you as a lead-in to something more sophisticated?
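For the sake of discussion, the decision that cron job would make on each run might look roughly like the Python below. It is a sketch under several assumptions: the paths and the host name are invented, the function `plan_cron_run` is hypothetical, the read-only mirror is assumed to have an fstab entry so it can be mounted by mountpoint alone, and rdiff-backup is invoked in its plain `rdiff-backup SOURCE DEST` form. The function only returns the commands it would run, so it can be tried without touching any real mounts.

```python
import os


def plan_cron_run(live_mount, mirror_mount, backup_dest,
                  is_mounted=os.path.ismount):
    """Return the commands (as argv lists) one cron run would execute.

    live_mount   -- the real read-write homedir mountpoint
    mirror_mount -- where the read-only rdiff-backup mirror gets mounted
    backup_dest  -- rdiff-backup destination, e.g. "host::/path"
    is_mounted   -- injectable check, defaults to os.path.ismount
    """
    if is_mounted(live_mount):
        plan = []
        if is_mounted(mirror_mount):
            # The real homedir is back, so drop the read-only stand-in.
            plan.append(["umount", mirror_mount])
        # Keep the mirror in sync while the real homedir is available.
        plan.append(["rdiff-backup", live_mount, backup_dest])
        return plan
    if is_mounted(mirror_mount):
        return []  # already switched over; nothing to do this run
    # The real homedir went away: mount the read-only mirror instead.
    return [["mount", "-o", "ro", mirror_mount]]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for argv in plan_cron_run("/home/hp", "/mnt/copy-of-hp",
                              "backuphost::/srv/backup/hp"):
        print(" ".join(argv))
```

A real script would hand each returned command to something like `subprocess.run` and would have to cope with failures part-way through, which this sketch ignores.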
-jef"i think the dead horse likes it"spaleta
On Wed, 2004-09-15 at 00:41, Havoc Pennington wrote:
On Tue, 2004-09-14 at 13:20 +0100, Stuart Ellis wrote:
Writable shared data stores presumably require some form of reconciliation or version control for multiple off-line copies to be handled smoothly. I suppose that one solution might be to wrap access to a version control system into the standard GUI so that the functionality becomes accessible to office workers; another might be to use a database-backed system like Storage, which could replicate.
<snip>
A sentiment around the office is that the "merge" concept is impossible to sanely present to users, and so instead we should stick to "master copy" and "backups".
It would certainly require some very hard thinking - I don't think that anybody has a good answer yet. In most cases it probably isn't really necessary, either. For me it just feels unavoidable for those sets of data with multiple authors.
For a file share used by multiple users, a simple approach is that they have a read only copy on their laptop, and it changes to the writable actual share while they are connected. Or just vanishes while disconnected, if appropriate.
Simplicity is definitely a key virtue here, for both the mechanisms and the presentation. This sounds right to me for most cases.
Just having reliable homedir backup, plus the above UI, would be very useful and dramatically better than what most people use today. Maybe a complex merge solution would be even better, but those solutions never seem to catch on even though they've been implemented many times...
I agree that integrating paired sync or backup really would be of immense benefit. It can be done with the technology available, and done better than the implementations in other OSes, as well. So I just jumped ahead to the next problem along...
As I see it, as technical users we end up using both rsync and CVS because we need both approaches. Or replacing CVS with Subversion, or tla... There also now seem to be a few products attempting to bring version control and file repositories to the nontechnical (IBM Workplace, MS Sharepoint and Volume Shadow Copy). I doubt that they will get it right, and usability will probably be the biggest failing. But there is clearly a present need to manage sets of files with multiple authors and remote or disconnected clients. The popular methods of file sharing right now are pretty awful for meeting those requirements.
This feels like the next logical consideration once you talk about client machines as interchangeable hosts for data that are capable of disconnected operation.
The amazing thing about Stateless Linux as described is that it can obviously be done; people on this list can go straight to talking about the detail because we know that the necessary fundamentals exist. The issue of data sharing is the one area where I couldn't think of any available technology that actually works well.