Candidate Projects

CIS 422/522, Fall 2005



Candidates

Usually I do not give you a choice about the first project. Having one project picked out ahead saves some time in getting started, which is important given the short time available. This term, though, I am going to try giving you some limited options. All teams will still do the same project, but you will have a chance to discuss and vote on which of the following projects that will be.

We will select a project by "rough consensus", rather than a simple majority vote. I will ask you to indicate both what you prefer and the strength of your preference, and I will attempt to select a project that is preferred by many and acceptable to as many as possible.

Candidate 1: MIME attachment separator

Synopsis

Batch filter for an IMAP mailbox that replaces embedded MIME attachments with links (URLs) to separate files.

Background

MIME is the system used to combine attachments with electronic mail messages. The MIME encoding allows an email message to have several parts, and some of those parts can be attached documents, such as pictures, word processor documents, etc.
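To make the multi-part structure concrete, here is a minimal sketch using Python's standard `email` package. The addresses, subject, and file name are made up for illustration, and the helper `list_parts` is a hypothetical name, not part of any library.

```python
from email.message import EmailMessage

# Build a small message with one attachment (all values are illustrative).
msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.org"
msg["Subject"] = "Draft 3"
msg.set_content("Here is the latest draft.")
msg.add_attachment(b"fake document bytes",
                   maintype="application", subtype="msword",
                   filename="document.doc")


def list_parts(message):
    """Return (content_type, filename_or_None) for each leaf part."""
    return [(part.get_content_type(), part.get_filename())
            for part in message.walk()
            if not part.is_multipart()]


for content_type, filename in list_parts(msg):
    print(content_type, filename)
```

Adding the attachment turns the message into a `multipart/mixed` container holding the text body and the attached document as separate parts.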

IMAP is a set of standard mail services and a protocol for coordination between a mail server and a mail client program (e.g., Eudora, Mulberry, Pine, Outlook, Apple Mail, etc.). An IMAP server keeps a master copy of a user's mailboxes. IMAP clients can synchronize their local copies of mail messages with the server, and can communicate with the server to delete, modify, and move mail messages, among other functions. Each IMAP mailbox on the server is typically a single text file in "mbox" format containing zero or more email messages. The email messages, including MIME attachments if any, are simply concatenated together in the mailbox.

POP (Post Office Protocol) is a set of mail services and a protocol similar to IMAP, but older and less rich. Many mail servers and mail clients support both POP and IMAP. The main difference is that IMAP can manage true synchronization between the server and several clients for the same email account. POP provides retrieval of messages, rather than true synchronization, so that (for example) if I have a laptop computer and a desktop computer, they may retrieve different subsets of my email.

A typical configuration is a Unix or Linux mail server providing storage and IMAP functions for email clients on personal computers running Windows, MacOS, or Linux. For example, Darkwing.uoregon.edu provides IMAP and POP service (as well as webmail service) to many U.O. students who read email on their personal computers.

The Problem

Some users keep a lot of email, sometimes filed away in several folders. Attachments in the messages can take a lot of space, and may be difficult to manage. For example, some people send several successive drafts of a document back and forth in email messages. Several months later, they may still want the email messages as a record of the collaborative editing, but they may prefer to discard the many drafts of the document or archive it on other media. Doing this message-by-message is a pain, and some mail clients don't even provide a function for removing an attachment without deleting the email message.

Target Users

My wife is a typical target user: She receives email from friends and family with attached pictures, and sooner or later she receives a warning that her disk quota has been exceeded. Currently, when that happens, she has to look at each message with an attachment, optionally dragging the attachment to a folder on her Macintosh before deleting the whole message. In an ideal usage scenario, she would instead go to a web page that has been set up by a system administrator, fill in a web form, and have those attachments moved to files in her web space. Then she should be able to browse the attachments by name, size, date, or kind, and choose to delete or compress some of them.

The scenario above implies another target user: Someone who has set up the MIME detacher for use by people who use email but who do not use Unix tools on the server. This could be a system administrator, but ideally any half-competent Unix user (e.g., me) could install and configure it for himself or herself and for others.

Key Product Features

I want a program that can be run on a Unix mail server on a particular IMAP mailbox to separate attachments from email messages. The attachments should be placed in separate files in a user-selected directory, and in place of the actual content, the email messages should be modified to contain URLs from which the actual content can be retrieved. Note that this implies some kind of configuration option for determining URLs for the new location of the documents, as well as some care to manage file names and prevent them from colliding (e.g., the third attachment called "document.doc" should not clobber the first and second attachment with that document name).
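The core transformation might look something like the following sketch, written against Python's standard `email` package. It is only one possible approach under stated assumptions: the function names `unique_path` and `detach`, the `-1`, `-2` suffix scheme for colliding names, the wording of the replacement note, and the `base_url` parameter are all hypothetical choices, and messages read from a real mailbox would need to be parsed with `email.policy.default` for `is_attachment()` to be available.

```python
import os
from urllib.parse import quote


def unique_path(directory, filename):
    """Return a path in `directory` that no existing file occupies."""
    # Drop any directory components the sender put in the file name.
    safe = os.path.basename(filename) or "attachment"
    stem, ext = os.path.splitext(safe)
    candidate, n = safe, 1
    while os.path.exists(os.path.join(directory, candidate)):
        candidate = f"{stem}-{n}{ext}"   # document.doc -> document-1.doc
        n += 1
    return os.path.join(directory, candidate)


def detach(message, directory, base_url):
    """Save each attachment to `directory`; replace it with a URL note.

    Returns the list of URLs written into the message.
    """
    urls = []
    for part in list(message.walk()):
        if not part.is_attachment():
            continue
        data = part.get_payload(decode=True)
        if data is None:   # e.g. an attached message/rfc822; left alone here
            continue
        path = unique_path(directory, part.get_filename() or "attachment")
        # Mode "xb" refuses to overwrite if another process created the
        # file between the existence check and the open.
        with open(path, "xb") as out:
            out.write(data)
        url = base_url.rstrip("/") + "/" + quote(os.path.basename(path))
        urls.append(url)
        # set_content() clears the old payload and Content-* headers first.
        part.set_content(f"[Attachment moved to {url}]\n")
    return urls
```

With this scheme, a second attachment named "document.doc" would be stored as "document-1.doc" and a third as "document-2.doc", so none clobbers an earlier one. A real tool would also need to decide how `base_url` is configured per user.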

Desirable and Optional Features

There are many possibilities for expanding beyond the bare minimum functionality. Among them:

Competing Products and Alternatives

Demime and Stripmime are filters that strip out MIME attachments for mailing lists. I have heard of MIME filters that strip executable attachments as part of firewall protection, and I believe some of these perform essentially the basic transformation described above, moving the content into separate web-accessible files and replacing them with URL references in email messages. I did not find an example of the latter in a search of SourceForge, and am not certain whether any open source version of such a tool is currently available.

Technical Notes and Additional Information

There are several available open-source libraries for manipulating MIME email files. It is also likely that parts of Stripmime or Demime can be repurposed, though I have not inspected them to determine this.

One must be careful when modifying "live" mailbox files that may be concurrently accessed or modified by the mail server, to avoid file corruption or lost updates.
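As a rough illustration of that caution, Python's standard `mailbox` module can take an advisory lock on an mbox file before rewriting it. This is a sketch, not a guarantee of safety: it only helps if the mail server honors the same locking conventions, which would have to be checked for the server in question. The function name `rewrite_mailbox` and the `transform` callback are hypothetical.

```python
import mailbox


def rewrite_mailbox(path, transform):
    """Apply `transform` to each message of an mbox file while holding a lock.

    `transform(message)` should modify the message in place and return
    True if it changed anything. Returns the number of messages changed.
    """
    box = mailbox.mbox(path, create=False)
    box.lock()   # raises mailbox.ExternalClashError if the lock is unavailable
    changed = 0
    try:
        for key in box.keys():
            message = box[key]
            if transform(message):
                box[key] = message   # queue the replacement
                changed += 1
        box.flush()                  # write pending changes to disk
    finally:
        box.unlock()
        box.close()
    return changed
```

The lock is held for the whole read-modify-write cycle, so that a delivery arriving in the middle is neither lost nor interleaved with the rewrite, provided the delivering program respects the lock.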

Candidate 2: Activity Recommender

This candidate is based on a sketch by Prof. Stephen Fickas, and is related to one of his current research projects in assistive technology. I have edited the sketch into this form, and have added some details that may not fit his vision. If we select this project, we will invite Prof. Fickas and/or his research assistants and collaborators to answer questions of clarification.

Synopsis

Manages and presents to a subscriber a set of activities selected by a caregiver.

Background

There are individuals who, because of age or disability, have particular difficulty getting out of their home. Not knowing of appropriate, accessible activities, they often stay home and watch TV. There are other individuals who are willing to help by watching for community events that the subscriber might be interested in. The proposed system is to facilitate that interaction, making it as efficient and effective as possible.

Target Users

There are two stakeholders in this system: The end user (whom we will refer to as the "subscriber") and a service provider (whom we will refer to as the "caregiver").

While a system of this kind could be useful for many audiences, the initial target subscriber is a person who spends a great deal of time at home because of age or disability. The subscriber may reside in their own home, alone or with others, or in a nursing home or a group home setting. They are physically able to leave the home environment, but it may be more difficult for them than for most people (e.g., it could involve arranging a ride in advance).

The caregiver may be a volunteer who helps a single subscriber (perhaps a family member) or several, or an employee of a for-profit or non-profit organization that serves a larger number of subscribers. The caregiver is aware of the interests of individual subscribers, as well as special circumstances (e.g., limited mobility) that may make some events and activities more accessible than others.

Key Features

At a minimum, the system provides a caregiver interface for entering information about events and activities deemed of interest to a particular subscriber, and a subscriber interface (perhaps web-based) for perusing and selecting activities. Information associated with each event should include

The end user can select an item, delete it, or leave it unselected. An unselected item is automatically deleted when its shelf life expires. A selected item generates one or more email reminders before the event. Additionally, a record of subscriber items is kept for the caregiver.
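The item lifecycle described above could be modeled roughly as follows. This is a sketch in Python for illustration only; the class and field names, the choice to mark expired items rather than physically remove them (so the caregiver's record survives), and the default reminder lead times of one day and two hours are all assumptions, not requirements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Status(Enum):
    UNSELECTED = "unselected"
    SELECTED = "selected"
    DELETED = "deleted"    # the subscriber removed it explicitly
    EXPIRED = "expired"    # ignored until its shelf life ran out


@dataclass
class Item:
    title: str
    event_time: datetime
    expires: datetime      # end of shelf life (assumed to be a timestamp)
    status: Status = Status.UNSELECTED


def expire_unselected(items, now):
    """Mark unselected items whose shelf life has passed as expired."""
    for item in items:
        if item.status is Status.UNSELECTED and now >= item.expires:
            item.status = Status.EXPIRED


def reminder_times(item, lead_times=(timedelta(days=1), timedelta(hours=2))):
    """Return when email reminders are due; only selected items get any."""
    if item.status is not Status.SELECTED:
        return []
    return sorted(item.event_time - lead for lead in lead_times)
```

Keeping the status on each item, instead of discarding expired or deleted items, would let the caregiver interface report which events were selected, deleted, or ignored.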

The caregiver interface supports basic record keeping and notes about the end user (interests, etc.) as well as entry of events and feedback to the caregiver regarding which events the subscriber has selected, deleted, and ignored.

Optional Features

Optional features of the caregiver interface could include aids in finding events (automatic scanning of news and community resource web sites, for example) and managing the event lists of several subscribers efficiently.

Technical Notes

This system could be web-based, or both interfaces could be Java applications. Professor Fickas has a Java client-server framework that has much of the networking done if you want to take that tack. A flat-file database is sufficient for the relatively small amount of data involved. A more ambitious approach might involve MySQL in the back end and JSP for a web-based interface.
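For a sense of how little machinery a flat-file store needs, here is a minimal sketch that keeps one JSON record per line. It is written in Python purely for illustration (a Java version would follow the same idea); the function names and the one-record-per-line layout are assumptions, not part of Professor Fickas's framework.

```python
import json
import os
import tempfile


def save_records(path, records):
    """Write records (a list of dicts) to `path`, one JSON object per line."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first, then swap it into place, so a crash
    # mid-write cannot leave a half-written data file behind.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out:
            for record in records:
                out.write(json.dumps(record, sort_keys=True) + "\n")
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_records(path):
    """Read all records back; a missing file is treated as an empty store."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as source:
        return [json.loads(line) for line in source if line.strip()]
```

Rewriting the whole file on every save is acceptable only because the data set is small; it is the kind of shortcut that a database back end would replace.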

Ideally the system would not only be suitable for the intended target audience, but flexible enough to be easily adapted to similar services for other subscriber audiences (e.g., cyclists or opera fans).


Last edit: Sun, 25 September, 2005 / Version identifier: $Id: candidates.html,v 1.1 2005/09/26 02:41:02 michal Exp $