|
CSE 7350/5350 David W. Matula |
Exam Scheduling Project Total Points: 400 Updated:
May MMVI |
Due Date: Early: 20 point bonus for May 5 (12 Latest: May 12 ( Note
: After Friday, May 12 at |
1 Introduction and Summary
In this project you are asked to implement an algorithm for
exam scheduling that reduces to an algorithm for graph coloring. You are first
asked in Part I to prepare several randomized student exam lists as a
randomized benchmark set and to implement two algorithms for converting the
exam lists into conflict graphs. In Part II you are asked to implement the
smallest last coloring algorithm for vertex coloring of the conflict graphs.
The coloring algorithm is to be applied to the randomized benchmark set from
Part I and a second set of benchmark graphs provided for the project as
described in Section 2. The graph coloring algorithm is also to be applied to
several original graphs you are asked to create to exhibit the strengths and
weaknesses of the coloring algorithm.
Regarding algorithm efficiency you are asked to verify a linear time bound O(|V|+|E|) for
smallest last graph coloring and suitable time and space bounds for each
conversion algorithm you implement for preparing the conflict graph from the
exam lists. You are asked to document your implementations in the form of a
report to a technical manager with substantial use of graphic display of
results. Section 3 provides more details on the grading issues and the format
for the project report. Section 4 provides details on the exam list generation
implementation requested as Part I of the implementation. Specifications for
Part II on the graph coloring procedure implementation are given in Section
5.
2 Benchmarks
A Randomized Case Benchmark of 10 particular random graphs
output from Part I should be part of your graph set input to your Exam
Scheduling Program in Part II. You will also be provided with ten additional
benchmark graphs with which to test your Exam Scheduling Program in Part II.
Five of these graphs were created using a graph generation program developed by
Paul Bartholomew. Your program must find a smallest last vertex ordering and
vertex coloring for each of these ten benchmark graphs as well as the graphs
created in Part I and additional graphs your create as requested in Parts I and
II.
These additional graphs are available in both adjacency list
and adjacency matrix representations as ASCII files on the Engr UNIX network in
the directory: /users/project/cse/CSE5350.
3
Project Format
Your project submission should be equivalent to a 10 -15 page
word-processor-composed report. This should be initiated by a 2 - 3 page Executive Summary containing an Introduction and Summary, a Programming Environment Description,
and References as described in the
following.
The core of your report should contain the Reduction to Practice details and Result Summary as outlined in the
following. With reference to the On Output perspective on
the class home page, the output of Part I is to the “Believer” audience since
it is input to a second program, and your report should substantiate the
correctness of the implementation. The output of Part II has a dual audience of
“Manager” for the executive summary and “Rational Learner” for the core of the
report, and these audiences should be kept in mind. In summary, the report
should be oriented towards a technical manager with a wide perspective, say in Scientific
American magazine style. Grading partition:
3.1 Executive Summary
Introduction
and Summary (50 points)
Give a brief description, in layman's terms, of the project,
including your major result. Consider this section to be a professionally
worded marketing effort. That is, would the competent technical manager
(your audience) be likely to want to read on and try to understand your claimed
good results. Describe the strongest features of your implementation. Summarize
your use of illustrations, figures, tables, or computer graphics employed that
will reveal your results effortlessly to the reader. Give clear citations (e.g.
[1]) to your sources listed in the References
section including downloaded programs and previous student's projects that most
influenced your work.
Programming
Environment Description (20 points)
Give a description of the environment you used to develop the
program including both hardware and software components, i.e., the specific
type and brand of computer, the computer's processing speed, the amount of
memory, the operating system, the language, any special libraries, etc. The
description should include metrics on the program resource utilization allowing
judgment of “was the total used respectable or an overkill?”. The purpose of
this description is to allow comparisons and facilitate reproduction. Consider:
would a knowledgeable colleague be able to judge the scope of your work from
this description of the environment and resource utilization? The graphics
package used to present the report and the package used to create the graph
drawings should also both be described.
References
(30 points)
Provide an enumerated list of publications, texts, and
programs in typical reference style as a concluding part of the executive
summary, for example:
[1] D.W. Matula, Exam Scheduling Project, www.engr.smu.edu/cse/7350/, 2003
IMPORTANT: If you read previous student projects from the archives in the department office, you should include a reference to each of them in your reference list.
3.2 Exam Scheduling Report
Reduction
to Practice (120 points)
This is the core of your report and should cover the five
following topics appropriately integrated.
·
Data structure Design: Describe and illustrate the
representation and organization of data employed and discuss the relation to
the algorithms employed for both Parts I and II.
·
Algorithm Descriptions: Provide a description of the
smallest last vertex ordering and graph coloring algorithm sufficient for your
report to be self-contained to someone familiar with other graph search
algorithms.
·
Algorithm Engineering: Describe how the algorithm
implementation is engineered to achieve efficiencies in time and space and
provide arguments for the efficiencies. In particular, carefully substantiate the
linear time and appropriate optimal space bounds for the radix sort-merge,
smallest last vertex ordering, and vertex coloring, in each case where your
implementation realizes this bound.
·
Verification Walkthrough: Present a walk through for the
vertex ordering and graph coloring of Part II for an original graph having 10
to 14 vertices and 20 to 30 edges. Choose the graph to expose challenging
aspects of the scheduling algorithm and your implementation.
·
Algorithm Effectiveness: Include here a summary table of
results as an overview of the benchmarks and your additional test graphs. Give
your conclusion on the strengths and weaknesses of the conversion and coloring
algorithms on the one hand and your implementation on the other. Indicate
applications (generalized scheduling, etc.) where your implementation would be
strongest and what kind of input would be most difficult to handle. Your
analysis and conclusions relevant to your separately created test graphs can be
a focal point of this topic.
Benchmark
Result Summary (80 points)
You
should provide a standard output for each randomized benchmark graph created in
Part I as specified in Section 4. You should provide output for each graph
colored in Part II as specified in Section 5.
Display
(100 points)
These points are awarded on the basis of your overall use of
illustrations, figures, tables, bar charts, and computer generated diagrams in
your report. Hand drawn and/or computer generated figures may be utilized as
appropriate.
·
40
points will be related to your displays associated with the algorithm and data
structure explanations for both Part I and Part II, including the verification
walkthroughs.
·
60
points will be related to your displays and graph drawings associated with
output results, including both sets of benchmark cases and the supplementary
verification case studies you create.
4
Part I: Simulated Exam List Conflict Graphs
This subproject utilizes pseudo random number generation to
create simulated student exam lists. A variety of methods possessing different
time and space requirements will be employed to accumulate conflict data
on-the-fly and prepare the output conflict adjacency list data structure.
Input: N = number
of exams.
S = number of students.
K = number of exams per
student.
Output: N
= number of exams (vertices).
M = number of distinct
pair wise exam conflicts (edges).
DEN = density of non zero exam conflict
entries (range 0-1).
Degree distribution: Plot the number of vertices having
degree i as a function of i
for i =
0,1,…, max degree, whenever max degree
is at most 200. If max degree
is greater than 200, present the
distribution in some appropriate graphic form.
Graph Adjacency
Matrix: Plot the
adjacency matrix of the conflict graph for all
cases with N £ 200, and also for cases with larger
N if your display tools allow
larger presentations. For conflicts occurring more than once, you may
employ
shading or some other mechanism to suggest higher density for the
plotted
“point”.
General Procedure: For input data generation we shall pick K distinct exams
between 1 and N according to some specified distribution using pseudo random
number generation, and note the specific conflicts implied. Note that each student
generates K(K-1)/2 pair wise conflicts, many of which are duplicate conflicts
shared by other students.
For output we shall consider a variety of procedures for
counting and accumulating the number of duplicate conflicts and preparing the
required sequential adjacency list structure of conflict data. Various types of
summary information should be presented as requested. The graph adjacency
structure should be produced and saved for input to the scheduling procedure.
Develop the adjacency list conflict data structure for the 10
benchmark data sets created as specified, and also for two-to-four more such
cases that illustrate your implementation features in size and “quality”.
Implement method 1 for application when N £
200, and implement either method 2 or 3 and apply to cases where N >
200.
Universal
Set Incidence.
Multiset: Provide storage of the upper triangular portion of the N´N
conflict matrix. After the input is processed the element of the array
corresponding to the exams I, J (for 1 £ I < J £ N)
should contain the number of students taking both exams I and J. This matrix is
scanned with the nonzero entries indicating the I, J pairs to be entered into
the adjacency list conflict data structure for the scheduling procedure.
Discuss and run tests for the largest value of N for which the computer system
used for your implementation conveniently
provides space, e.g. N ³ 500 should be feasible for moderate
size systems. For Benchmark 1 only print the conflict frequency matrix.
Sequential
Adjacency Structure Construction.
Describe and develop a (linked) adjacency structure to
dynamically record conflicting exams and their frequency. For each on-the-fly
received conflict of exam I with J for I < J, search the adjacency structure
with I for J and: if J is found, increment the corresponding frequency counter
by unity; else append J as an appropriate new entry of the adjacency structure
with initial frequency count of unity.
Radix sort/merge.
(Optional if developed in class: 30 bonus points)
Employ the radix sort/merge procedure developed in class to
accumulate the frequency counts. For each merge report the size of the row
conflicts group and the resulting size of the adjacency structure after
accumulating that group.
Distributions
|
Skew: The density should look as follows: 1
N |
Two Tier: 50% of the density should be in the first 10% of
the range.
Ratio: 9:1 0 10% N 100% N |
Benchmark Data Sets
Benchmark N K Den Distribution
|
1 |
16 |
3 |
30% |
Skew |
|
2 |
100 |
5 |
20% |
Uniform |
|
3 |
100 |
5 |
50% |
Uniform |
|
4 |
100 |
5 |
80% |
Uniform |
|
5 |
200 |
5 |
50% |
Skew |
|
6 |
200 |
5 |
50% |
Two Tier |
|
7 |
300 |
5 |
10% |
Skew |
|
8 |
300 |
5 |
10% |
Two Tier |
|
9 |
600 |
5 |
2% |
Two Tier |
|
10 |
1000 |
5 |
1% |
Two Tier |
5
Part
II: Graph Coloring
This subproject utilizes the smallest-last coloring algorithm
to provide a vertex coloring of the conflict graph to implement the scheduling.
Input: A
graph in adjacency list form.
Output: A coloring of the vertices of the
graph, suitably presented to an audience of type “rational learner” that wants
to visualize and understand the results for possible application of the program
as implemented.
5.1 Tables and
Displays in Report
Your report should include the following for each graph,
along with any additional information you feel would “sell” your
implementation.
Tables
Header:
A description of the graph and key parameters, such as N, S, K, M, DEN,
and probability type for each randomized graph obtained form Part I. For the
addition ten benchmark graphs the corresponding parameters N, M, DEN, and some
additional descriptive graph date, such as the distribution of vertex degrees.
Walkthrough Tables: For graphs with at most
200 vertices tabulate for each vertex in
the order colored (1) the vertex label , (2) the degree, (3) the degree
when deleted, (4) the number of adjacent colors when color assigned, (5) the
final number of adjacent colors, (6) the vertex color. NOTE: If this data is not printed for vertices in
the right order there will be a 40 point penalty.
For graphs
with over 200 vertices provide summary data with regards to the above, such as
averages or distribution functions for parameters (2) – (5). Provide a coloring
algorithm walkthrough for randomized benchmark 1 (16 vertices) illustrating the
order deleted by 15 successive cuts. For all graphs provide a size distribution
table for the independent sets of vertices colored with color j for j = 1, 2, …, maxcolor.
Display
Use a graph drawing program to draw
the graphs where possible, illustrating vertex colors either by distinct colors
(preferred) or integer labels.
5.2 Smallest Last
Coloring Implementation Outline
The coloring algorithm may be
summarized with reference to the following vertex records. The variables stored
in fields 3 and 4 are illustrated to change dynamically at different stages in
the algorithm to emphasize the distinct features of each stage. Fields 1 and 2
remain constant preserving the graph.
Record
for Vertex I, 1 £
I £
N.
|
Vertex
I |
Field
1 |
Field
2 |
Field
3 |
Field
4 |
|
Degree
|
Pointer
to edge list |
(i) Current Degree (ii) –1 when deleted (iii) Color Value |
(i) Pointers for doubly-linked list of vertices
of same current degree (ii) Pointer for order deleted list (iii) Number of adjacent colors when color
assigned |
The
smallest last vertex ordering algorithm may be implemented by recursively
deleting a vertex of smallest current degree, updating the third and fourth
fields of the record for each adjacent undeleted vertex, and adding the vertex
deleted to the ordered list of vertices deleted. After all vertices have been
deleted, the coloring algorithm can assign colors (exam periods) to each vertex
in an order opposite to the order deleted, assigning for a “color” the smallest
non-negative integer not previously assigned to any adjacent vertex.