Differences between revisions 2 and 22 (spanning 20 versions)
Revision 2 as of 2015-03-26 19:20:15
Size: 65018
Editor: ?PetterReinholdtsen
Comment: First draft translation.
Revision 22 as of 2015-05-23 19:56:23
Size: 61509
Editor: ?PetterReinholdtsen
Comment:
Line 1: Line 1:
Line 6: Line 5:
After the office is up and running with a sensible workflow for tickets (user requests and troubleshooting) you will move on to the biggest challenge for the organization. As a rule, this is either change management or problem solving. Organizations with "cowboy" system administrators who come up with smart ideas and implement them without much testing, often begin with change management. For organisations suffering recurring outages, problem solving comes first. After the office is up and running with a sensible workflow for tickets (user requests and troubleshooting) you will move on to the biggest challenge for the organisation. As a rule, this is either change management or problem solving. Organisations with "cowboy" system administrators who come up with smart ideas and implement them without much testing, often begin with change management. For organisations suffering recurring outages, problem solving comes first.
Line 22: Line 21:
This way, users get one point of contact, and service desk operators get an overview of all of the cases. Operations can be expected to troubleshoot across all parts of the organization. Periodically the team leader needs to go through all issues and solutions in order to prioritize debugging and to prevent re-occurrence of errors, in order to provide schools with a stable operating environment. This way, users get one point of contact, and service desk operators get an overview of all of the cases. Operations can be expected to troubleshoot across all parts of the organisation. Periodically the team leader needs to go through all issues and solutions in order to prioritize debugging and to prevent re-occurrence of errors, in order to provide schools with a stable operating environment.
Line 32: Line 31:
Thus the log of requests is a basic and necessary tool both for users and the service desk. There are several freely available systems for logging requests with good documentation <ref>RT Essentials: http://www.oreilly.com/catalog/rtessentials/chapter/index.html
</ref>. Skolelinux Drift uses RT <ref>RT Essentials: http://www.oreilly.com/catalog/rtessentials/chapter/index.html
</ref> to handle requests.

One important thing when starting up support is not to get a too tough start. Do not try to achieve everything at once. Bet rather on "quick wins" that keep the user informed, and short response times. It is also important to clarify who Service Desk should forward events to, if they can not figure out the inquiry themselves. The support must also be able to see if there are disruptions for the user. This makes it quick and easy to give feedback.

For the users it is important that incidents are handled. For the service office it is important that the incidents are handled correctly according to the service level agreement, and that work requested outside what is already agreed upon is handled between leaders at the school and the system administration organisation.
Thus the log of requests is a basic and necessary tool both for users and the service desk. There are several freely available systems for logging requests with good documentation <<!FootNote(RT Essentials: http://www.oreilly.com/catalog/rtessentials/chapter/index.html
)>>. Skolelinux Drift uses RT <<!FootNote(RT Essentials: http://www.oreilly.com/catalog/rtessentials/chapter/index.html
)>> to handle requests.

One important thing when starting up support is not to make the start too tough. Do not try to achieve everything at once; aim instead for "quick wins" that keep the user informed, and for quick response times. It is also important to clarify who the service desk should forward incidents to if they cannot solve the issue themselves. The support desk must also check whether there are disruptions for the user. This makes it quick and easy to give feedback.

For the users it is important that incidents are dealt with. For the service office it is important that the incidents are handled correctly according to the service level agreement, and that work requested outside of what was agreed is handled between management at the school and the system administration organisation.
Line 42: Line 41:
We recommend to agree upon what duties the school's ICT contact has and what is the responsibility of those who work at the Service Desk. Schools often have little resources compared to what is common in municipal administrations or private companies. At the same time, one usually has many more users and often more client machines than what is in use in the rest of the municipality.

To distribute tasks must have in place roles. By having clearly established roles is easier to distribute tasks, and the working capacity necessary to resolve operational tasks in a good way. From experiences in municipalities and professional organizations operating shows that these roles are common.
We recommend agreeing on what duties the school's ICT contact has and what the responsibilities are of those who work at the service desk. Schools often have few resources compared to what is common in municipal administrations or private companies. At the same time, schools usually have many more users, and often more client machines, than the rest of the municipality.

To distribute tasks, roles must be in place. With clearly established roles it is easier to distribute tasks and to ascertain the working capacity needed to resolve operational tasks. Operational experience in municipalities and professional organisations shows that these roles are common.
Line 90: Line 89:
The advantage of an agreement for these tasks is that one both know what is expected of the individual, - and has a good basis for planning and managing ICT services. Usually these ICT tasks is done only as a part of the job of a teacher who also have teaching duties.

A business would often have two staff members working full time opeerating 100 standard client machines with 100 users. In schools maybe a 30% position in total, divided among several persons, operate 100 client computers used by 320 students and teachers.

When the school has so few resources to operation, it is crucial to have good management of resources. Making agreements for the tasks, can make it easier to assess whether you need additional resources or to reduce expectations of IT initiatives in schools from budgetary considerations. By having a good overview of the ICT tasks in the school, IT administrators could easier ask for an increase in resources if necessary. There may be a need for increased resources to implementate ICT-based exams or a need for new equipment like whiteboard, as an aid in teaching.
The advantage of an agreement for these tasks is that expectations on the individual are known, giving a good basis for planning and managing ICT services. Usually these ICT tasks are only done part-time by a teacher who also has teaching duties.

A business would often have two staff members working full time, operating 100 standard client machines with 100 users. In schools there may be a 30% position in total, divided among several persons, operating 100 client computers used by 320 students and teachers.

When the school has so few resources for operations, good resource management is crucial. Making agreements for the tasks can make it easier to assess whether you need additional resources, or need to reduce expectations of IT initiatives in schools for budget reasons. With a good overview of the ICT tasks in the school, it would be easier for IT administrators to ask for an increase in resources if necessary. There may be a need for increased resources to implement ICT-based exams, or a need for new equipment like whiteboards as teaching aids.
Line 98: Line 97:
We've created a table showing time spent on operation and maintenance. The table is based on the experiences of municipalities that central operates Debian Edu of 9-10 schools with 250-500 client computers. Several things are not included in the table. That's why one must set up extra time to projects where ans when schools develop ICT solutions with network and more equipment.

<table>
<tbody>
<tr class="odd">
<td align="left">'''''Role'''''
</td>
<td align="left">'''''Operational responsibilty'''''
</td>
<td align="left">'''''Time spend per school per week'''''
</td>
<td align="left">'''''Time spent in toal for all schools'''''
</td>
</tr>
<tr class="even">
<td align="left">Operation manager centrally
</td>
<td align="left">
Monitoring, debugging and operation of 500 machines, for example, 10 schools with 3,200 students and teachers.
</td>
<td align="left">
2-3 h

(50 clients)
</td>
<td align="left">
½ position

(500 clients)
</td>
</tr>
<tr class="odd">
<td align="left">
ICT contact at each school
</td>
<td align="left">
Oversight of equipment, easy maintenance, and reporting of incidents and requests
</td>
<td align="left">
3-4 h

(50 clients)
</td>
<td align="left">
1 position

(10 schools / 500 clients)
</td>
</tr>
<tr class="even">
<td align="left">
ICT-coordinator sentrally
</td>
<td align="left">
Assist in planning and implementation of educational and technical ICT work in the school.
</td>
<td align="left">
1-2 h
</td>
<td align="left">½ position
</td>
</tr>
<tr class="odd">
<td align="left">
ICT manager (principal)
</td>
<td align="left">Make joint purchases, and make sure the compliance of the service level agreement. Schedule updates, or developing solutions
</td>
<td align="left">1 h
</td>
<td align="left">¼ position
</td>
</tr>
<tr class="even">
<td align="left">
'''Overall for school'''
</td>
<td align="left">'''50 client machines (concurrent users)'''
</td>
<td align="left">'''6 - 10 h'''
</td>
<td align="left"></td>
</tr>
<tr class="odd">
<td align="left">
'''Overall for all schools'''
</td>
<td align="left">
'''10 schools, 500 client machines (concurrent users)'''
</td>
<td align="left"></td>
<td align="left">
'''2 ¼ position'''
</td>
</tr>
</tbody>
</table>


Experience shows that the scope of work of the ICT contact is affected by the number of concurrent users. The term "concurrent users" is new to many. To illustrate with an example: A school may have 250 students but not more than 50 computers. Then a maximum of 50 students can use computers at the same time. This is much less than the total 250 users who have an account on the system. It is these 50 logged in users that provides work for IT service. The other 200 people not logged in give little extra work.

Therefore, it is common to calculate IT costs from the maximum number of concurrent users. Other calculation methods are also possible, for example when paying for proprietary software. But since Debian Edu has no license costs, the number of concurrent users is the most crucial for operating costs. To calculate costs from user accounts provide little or no meaning in school.

For users of Debian Edu the cost difference to manage 100 or 250 user accounts is very small. There are a few exceptions. With 250 students in stead of 100, some more studens may constantly forget their password. Therefore, it is wise to let the teacher responsible for the class to give these students the new password.

If the school has 50 client machines, the ICT contact needs less time on their operational tasks than if the school has 150 clients. With multiple clients increases the overall time spent on the operation. But operating time per client machine goes down somewhat.

Several municipalities have set aside 3-4 hours a week to the ICT contacts tasks at each school when it is installed 30-70 client machines. The Education Department in Oslo has set aside half weekday, or 30% position to follow up 150 client machines. Experiences from other municipalities suggests that a 20% position is enough to solve the tasks of a local ICT contact when a school has 160 thin or diskless clients with Debian Edu.

In addition, the costs of centralized operations, ICT management, and construction of the educational use of ICT tools in school subjects. Probably it is enough with one position for the operation of 1000 client machines. When it comes to educational support, several principals have a 50-100% position in the school for this work. There may be a 10-20% position as ICT contact and a 40-80% position as educational support for the teachers. Many teachers perceive IT tools in schools as something new. Some principals wish to give more backing to the educational side by making teacher more confident in using IT tools in the different subjects.
We've created a table showing time spent on operations and maintenance. The table is based on the experiences of municipalities that centrally operate Debian Edu for 9-10 schools with 250-500 client computers. Several things are not included in the table. Extra time is therefore required for projects where schools develop their own ICT solutions with networking and more equipment.

||'''''Role'''''||'''''Operational responsibility'''''||'''''Time spent per school per week'''''||'''''Time spent in total for all schools'''''||
||Centralised operations staff||Monitoring, debugging and operation of 500 machines, for example 10 schools with 3,200 students and teachers.||2-3 h (50 clients)||½ position (500 clients)||
||ICT contact at each school||Oversight of equipment, easy maintenance, and reporting of incidents and requests||3-4 h (50 clients)||1 position (10 schools / 500 clients)||
||Central ICT-coordinator||Assist in planning and implementation of educational and technical ICT work in the school.||1-2 h||½ position||
||ICT manager (principal)||Make joint purchases, and ensure compliance with the service level agreement. Schedule updates, or develop solutions||1 h||¼ position||
||'''Overall for a school'''||'''50 client machines (concurrent users)'''||'''6-10 h'''|| ||
||'''Overall for all schools'''||'''10 schools, 500 client machines (concurrent users)'''|| ||'''2 ¼ positions'''||
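
The total in the last row can be checked by adding up the positions per role. The following is a minimal sketch in Python that uses only the figures from the table; the variable names and the clients-per-position figure are illustrative and not part of the original text.

```python
from fractions import Fraction

# Positions per role for 10 schools / 500 clients, copied from the table.
positions = {
    "Centralised operations staff": Fraction(1, 2),
    "ICT contact at each school": Fraction(1),
    "Central ICT-coordinator": Fraction(1, 2),
    "ICT manager (principal)": Fraction(1, 4),
}

total_positions = sum(positions.values(), Fraction(0))
clients = 500

print(total_positions)                   # 9/4, i.e. 2 ¼ positions
print(round(clients / total_positions))  # client machines per full position
```

Exact fractions are used instead of floats so that the quarter and half positions add up without rounding noise.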

Experience shows that the scope of work of the ICT contact is affected by the number of concurrent users. The term "concurrent users" is new to many. To illustrate with an example: A school may have 250 students but not more than 50 computers. Then a maximum of 50 students can use computers at the same time. This is much less than the total 250 users who have an account on the system. It is these 50 logged in users that provide work for IT service. The other 200 people not logged in give little extra work.
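
The example can be written as a small calculation. This is a sketch only, assuming that each client machine seats one user at a time; the function name is made up for illustration.

```python
def concurrent_users(accounts: int, client_machines: int) -> int:
    """Upper bound on the number of users logged in at the same time."""
    return min(accounts, client_machines)


# Figures from the example: 250 students with accounts, 50 computers.
accounts = 250
client_machines = 50

peak = concurrent_users(accounts, client_machines)
not_logged_in = accounts - peak

print(peak, not_logged_in)  # 50 200
```

The number of accounts only matters when it is smaller than the number of machines, which is rarely the case in a school.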

Therefore, it is common to calculate IT costs from the maximum number of concurrent users. Other calculation methods are also possible, for example when paying for proprietary software. But since Debian Edu has no license costs, the number of concurrent users is the most crucial figure for operating costs. Calculating costs from the number of user accounts makes little sense for a school.

For users of Debian Edu the cost difference to manage 100 or 250 user accounts is very small. There are a few exceptions. With 250 students instead of 100, some students may repeatedly forget their password. Therefore, it is wise to let the teacher responsible for the class give these students a new password.

If the school has 50 client machines, the ICT contact needs less time for operational tasks than if the school has 150 clients. With more clients, the overall time spent on operations increases, but the operating time per client machine goes down somewhat.

Several municipalities have set aside 3-4 hours a week for the ICT contact's tasks at each school with 30-70 client machines. The Education Department in Oslo has set aside half a weekday, or a 30% position, to follow up 150 client machines. Experience from other municipalities suggests that a 20% position is enough to handle the tasks of a local ICT contact when a school has 160 thin or diskless clients with Debian Edu.

In addition there are the costs of centralized operations, ICT management, and developing the educational use of ICT tools in school subjects. One position is probably sufficient for the operation of 1000 client machines. When it comes to educational support, several principals have a 50-100% position in the school for this work. There may be a 10-20% position as ICT contact and a 40-80% position as educational support for the teachers. Many teachers perceive IT tools in schools as something new. Some principals wish to give more backing to the educational side by making teachers more confident in using IT tools across the different subjects.
Line 195: Line 121:
We have sat up a list of tasks to be resolved to put in place a proper Service Desk.

 * Get in place people in different roles that IT manager, IT contact in schools, operator(s) centrally and IT coordinator for all schools. It is important to make a distinction between what is technical operations and maintenance, - and the pedagogical work.
 * Establish the service desk, where each school has a service agreement that regulates what counts as standard operating activities and what is extra. It is crucial that the principals responsible for IT take part in this process all the way.
 * Establish system for request tracking. All inquiries on email get a case number. Almost all inquiries from users or IT contacts at schools calling in also get a case number.
 * Make sure the IT budget reflects the work effort needed to ensure sound operation of the school's computer equipment and network. The requirement today is that the IT systems are to be used for national and local tests using IT tools, with or without the Internet.
 * As a starting point, use the standard edition of Skolelinux with the same version at all schools. Make the changes you want on top of this. These changes must be kept in a configuration database with documentation of the changes made. A version control system can be used to store the changes and the documentation.

== Incident Management ==

The purpose of the IT service is to prevent disruptions such as downtime or reduced quality when using the software. Users will experience few problems with the IT system if the IT service has enough resources for operations, equipment and enquiries to the service desk. Even so, small or large errors occur that disrupt users. That is when good incident handling is needed.

In the parachuting community, near-accidents are called "incidents". This is perhaps not quite the same as when something does not work in computer operations. The purpose of handling incidents is to restore the service as quickly as possible so that everything works as usual. If something goes wrong, it should affect users as little as possible. What counts as normal service is agreed in an operating agreement that describes the service level.

Incident statistics are important, especially when several people work on operations. When several people work together, it is easy to lose track of all the cases. Statistics can reveal problem areas that need more thorough handling than a quick fix from the service desk. For example, there may be many requests to change passwords for pupils who have forgotten theirs. In that case it may be wise to let the class teacher change the pupil's password.

An operational disturbance is defined as:

 * an event which is not part of normal operations and which causes, or may cause, an interruption to or a reduction in the quality of the service.

Examples of operational disturbances include:

 * Programs
  * the office suite (OpenOffice.org) does not start
  * the web browser (Firefox) crashes
  * the disk storage is full
 * Hardware
  * the server is down
  * unable to print
  * unable to log in
 * Requests
  * questions about information, advice or documentation
  * forgotten password

The examples show some of the most common operational disturbances. These are problems that lead users to contact the ICT contact at the school or the service desk. The IT service must prioritize what is to be handled at once and which problems need more time to solve. To prioritize which problems need more extensive troubleshooting, it is important to log all enquiries about operational disturbances. With an overview of which disturbances are most frequent, measures can be directed at the areas with the most problems.
We have put together a list of tasks for setting up a new service desk.

 * Arrange people in different roles like IT manager, IT contact in schools, central operations and IT coordinator for all schools. It is important to make a distinction between what is technical operations and maintenance, and what is pedagogical work.
 * Establish the service desk such that every school has a service agreement regulating what is standard operating activities, and what is extra. It is imperative that ICT-responsible principals are a part of this process.
 * Establish a system for handling incoming requests (a request tracker). All enquiries by email need a case number. Almost all enquiries from users or IT contacts from schools also need a case number.
 * Ensure that the ICT budget reflects the effort necessary for proper operation of school computer equipment and networks. The requirement today is that the ICT systems will be used for national and local tests using ICT tools, with or without the Internet.
 * As a starting point, use the standard edition of Debian Edu with the same version at all schools. Make the changes you want from there. These changes must be kept in a configuration database with documentation of the changes made. Version control can be used to save the changes and documentation.

== Incident Management ==

The purpose of the ICT service is to prevent disturbances like shutdowns or software issues. Users will experience few problems with the ICT system if the ICT service has enough resources for operations, equipment and enquiries to the service desk. Even so, small or big problems will occur and interrupt users, so good handling of incidents is necessary.

In parachuting, near-accidents are called "incidents". It is perhaps not quite the same in computer operations when something is not working. The purpose of dealing with incidents is to restore services as quickly as possible so that everything works normally. If something goes wrong, it must have the least possible impact on users. What counts as a "normal service" is agreed through an operating agreement describing the service level.

Incident statistics are important, especially if several people work within the organisation. When several people work together, it is easy to lose track of the work. Statistics will point out problem areas that must be addressed more thoroughly than a quick fix from the service desk. For example, there may be many requests to replace forgotten passwords, so it may be wise to let the teacher change passwords for pupils in their class.

An operational disturbance is defined as:

 * an event which is not part of normal operations and causes, or can cause, an interruption or reduction in the quality of the service.

Examples of operational disturbances may be:

 * Programs
  * the office program (!OpenOffice.org) does not start
  * the web browser (Firefox) crashes
  * the hard drive is full
 * Hardware
  * the server is down
  * unable to print
  * unable to log in
 * Requests
  * requests for information, advice or documentation
  * forgotten password

The examples show some of the most common operational issues. These are problems that prompt users to contact the school or the service desk. The ICT service must prioritize what must be handled straight away, and which problems need more time to resolve. To prioritize which problems need more comprehensive debugging, it is important to log all enquiries about malfunctions. Once one has an overview of the most common problems, appropriate actions can be taken.
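
Getting such an overview from a request log can be very simple. The sketch below assumes that each logged enquiry has been given a category; the log contents are invented sample data and the function name is hypothetical.

```python
from collections import Counter

# Invented sample data: one category per logged enquiry.
incident_log = [
    "forgotten password",
    "unable to print",
    "forgotten password",
    "server is down",
    "forgotten password",
    "unable to print",
]


def most_common_problems(log, top=3):
    """Return the most frequent categories with their counts, highest first."""
    return Counter(log).most_common(top)


for category, occurrences in most_common_problems(incident_log):
    print(f"{occurrences:2d}  {category}")
```

With this sample data, forgotten passwords come out on top, which is the kind of finding that suggests letting class teachers reset passwords.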
Line 233: Line 159:
We have made a short checklist to ensure that routines and systems for good incident handling are in place:

 * The operator who does the troubleshooting is the one who reports status back to the IT contact at the school and/or the user
 * The incident logging system must be in place and work both technically and functionally for those who handle incidents at the schools and at the service desk
 * The incident logging system must be used for practically all operational incidents
 * Statistics are produced from the incident log at regular intervals. The statistics are used to put in place measures that remove recurring problems that irritate users.

=== Planning and implementation ===

Setting up a usable incident logging system takes somewhat more than installing it. Everyone in the operations department must use the system. Those who report errors must also get feedback by e-mail with a case number. This requires considerable configuration of the incident logging system. In addition, simple user training must be provided for those who receive the enquiries.

Large and comprehensive plans are not needed to put proper incident handling in place. Handling incidents is a completely standard task for those who work at the service desk or as ICT contacts at each school. Setting up a software tool for logging incidents may require up to a few weeks for a correct configuration, and users may also report incidents by e-mail and by phone.

The user interface of the logging system is relatively self-explanatory, so it does not take many hours to start using it. Through daily use one becomes more and more comfortable with what should go into the logged messages. It is crucial that everyone in the operations department uses the system for logging operational messages.

=== Activities during operational disturbances ===

To give an idea of the activities triggered when an incident is reported, we use an example.

A user contacts the service desk with a problem. Printing does not work, the user reports by phone. The operator logs the incident right after the call has ended. The printing problem becomes a case with a case number (assigned automatically).

The operator at the service desk makes a quick analysis. Has the print queue stopped again, or is it something else? Perhaps paper or toner is missing? By examining the print queue, the operator sees that it has filled up. She clears the queue and checks whether the next print job is printed.

This time the print queue filled up again. The operator contacts the school's IT contact and asks them to check whether there is paper in the printer. This is noted in the incident log. The IT contact reports back that paper has been refilled and that printing works as normal. The case is closed, which is also noted in the incident logging system.

If the printer had not started again, it could have been out of toner, or the printer could have had a fault. Had it been a fault, the operator would have had to escalate the problem. Escalation means that someone other than the operator and the ICT contact solves the problem. In this example, help would have been needed from a service technician who can repair printers.

This example shows that a whole apparatus is set in motion to get a stopped printer going again. If the printer does not work even after more paper has been put into an empty printer, one must first check whether toner is missing. If everything seems to be in place but things still do not work, the problem must be escalated. The operations department calls in an expert in the relevant area to fix the fault. This time it was a service technician for printers.

The cause of the fault and the repair carried out are noted in the incident logging system.
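
The printer example can be modelled as a case with an automatically assigned number, a running log and a status. This is only a sketch to illustrate the flow; the class and method names are hypothetical and do not reflect how RT or any other request tracker is implemented.

```python
from dataclasses import dataclass, field
from itertools import count

_case_numbers = count(1)  # stand-in for the tracker's automatic numbering


@dataclass
class Case:
    summary: str
    case_number: int = field(default_factory=lambda: next(_case_numbers))
    status: str = "open"
    log: list = field(default_factory=list)

    def note(self, text: str) -> None:
        """Record an action or observation in the incident log."""
        self.log.append(text)

    def escalate(self, to: str) -> None:
        """Hand the case to someone other than operator and ICT contact."""
        self.status = "escalated"
        self.note(f"escalated to {to}")

    def close(self, cause: str) -> None:
        """Close the case and record the cause and the repair."""
        self.status = "closed"
        self.note(f"closed: {cause}")


case = Case("Printing does not work (reported by phone)")
case.note("print queue was full; cleared it, but it filled up again")
case.note("asked the school's IT contact to check the paper")
case.close("printer was out of paper; refilled by the IT contact")

print(case.case_number, case.status, len(case.log))
```

Had refilling paper not helped, `case.escalate("service technician")` would record the hand-over instead of closing the case.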

=== Roles ===

A number of roles are involved when the IT service deals with reports that something does not work. In the example above, the school's IT contact and the operator cooperate to solve the printing problem. Had the problem been bigger, a service technician would have had to be called in. If the printer cannot be fixed, a new one must be bought. If the school has to get a new printer, the person responsible for IT may have to be involved to obtain the money. In many places the principal has the final word.

In short, many people quickly become involved when something does not work. As a starting point, problems should be solved there and then. That avoids involving many people who cannot help solve the problem. Escalating problems that can be solved locally quickly becomes more expensive, not least because many enquiries should be simple to handle on the spot. Other enquiries concern more complex problems, and then more people must be involved. If extra or external help is needed to solve the problem, this must as a general rule be cleared with the operations manager. The important thing is to handle operational incidents deliberately and to use resources well.

=== Key points ===

We have set up some key points for incident handling. They are meant to help assess whether a good job is being done, based on measurable and well-defined requirements. Such measuring points are:

 * Total number of operational disturbances
 * Average time from when an enquiry is received until the problem is solved, broken down by code (a well-organised operations department has codes for incidents and fault types)
 * Percentage of incidents handled within the agreed response time (agreed in the service level agreement)
 * Average cost per incident
 * Percentage of incidents solved by the help service without passing on to the next level of operational support
 * Incidents per client machine (workplace)
 * Number and percentage of incidents solved from the operations centre without a visit to the school
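
Most of these figures can be computed directly from the incident log. The sketch below assumes that each logged incident carries a code, a resolution time and three yes/no flags; the field names and the sample data are hypothetical, and average cost is left out because it depends on local cost data.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Incident:
    code: str                     # incident or fault-type code
    hours_to_resolve: float       # from enquiry received to problem solved
    within_agreed_time: bool      # handled within the agreed response time
    solved_by_service_desk: bool  # no hand-over to the next support level
    solved_remotely: bool         # no visit to the school needed


def key_figures(incidents, client_machines):
    """Compute key figures for a non-empty list of incidents."""
    total = len(incidents)
    if total == 0:
        raise ValueError("no incidents logged")

    hours_by_code = defaultdict(list)
    for incident in incidents:
        hours_by_code[incident.code].append(incident.hours_to_resolve)

    def percent(flags):
        return 100 * sum(flags) / total

    return {
        "total": total,
        "mean_hours_by_code": {c: mean(h) for c, h in hours_by_code.items()},
        "pct_within_agreed_time": percent(i.within_agreed_time for i in incidents),
        "pct_solved_by_service_desk": percent(i.solved_by_service_desk for i in incidents),
        "pct_solved_remotely": percent(i.solved_remotely for i in incidents),
        "incidents_per_client": total / client_machines,
    }


sample = [
    Incident("print", 1.0, True, True, True),
    Incident("print", 3.0, True, True, True),
    Incident("password", 0.5, True, False, True),
    Incident("server", 6.0, False, False, False),
]
figures = key_figures(sample, client_machines=50)
print(figures)
```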

=== Tools ===

A number of tools can make it easier to handle operational disturbances.

 * Automatic logging
 * Automatic routing of incidents to the right people
 * Automatic retrieval of data from the configuration management database
 * Telephone and e-mail that work easily together with tools for registering enquiries and incidents.
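
Automatic routing can be as simple as a lookup table from incident category to role. The mapping below is an invented example, not a recommended assignment; the roles are taken from the text, while the table and function names are hypothetical.

```python
# Hypothetical routing table: incident category -> role that handles it first.
ROUTES = {
    "forgotten password": "class teacher",
    "unable to print": "ICT contact at the school",
    "server is down": "central operator",
}
DEFAULT_RECIPIENT = "service desk"


def route(category: str) -> str:
    """Return the role an incident is sent to; unknown categories stay at the service desk."""
    return ROUTES.get(category.strip().lower(), DEFAULT_RECIPIENT)


print(route("Unable to print"))
print(route("projector flickers"))
```

Keeping a default recipient means that an incident with an unknown category is never dropped.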

== Problem Management ==

Problem management is an "investigative" process. Known errors are most often handled directly by the service desk. This is the most common form of incident handling. With unknown errors, one must investigate more closely what is wrong. This kind of troubleshooting requires both common sense and flair. Good operations staff use their flair to go straight to the problem, find the solution, and restore the service as quickly as possible so that everything works as usual.

'''Problem management is:'''

 * Problem control
 * Error control
 * Proactive control to prevent problems
 * Identifying error patterns using information from, for example, incident handling

'''Problem control'''

 * Identify problems
 * Classify problems
 * Investigate problems

'''Error control'''

 * Identify and record known errors
 * Find temporary solutions where possible
 * Contact those responsible for change management to have the error removed permanently

'''Proactive control'''

 * Identify and solve problems and errors before the incident is reported by users.
 * Use logs and information from incident handling to see where problems may arise

=== Procedures for problem management ===

We have attached a comprehensive collection of problem solutions and configuration recipes. During the summer of 2006 this will also be published on the Internet. The recipes will be maintained by professional operators at schools, municipal IT services and private operators. To make it easy to improve the documentation, it has all been placed in a wiki hosted on a Skolelinux server.

Wiki technology has proved to be a great success for maintaining catalogued information on the Internet. It is easy to contribute and all changes are logged. It is also possible to import OpenOffice.org documents and export recipes as PDF.

== Konfigurasjonsstyring (Configuration Management) ==

Ressursene som brukes på IT-systemene i skolen må håndteres på en økonomisk forsvarlig måte. Da må man ha styring på tjenestene som brukes og utstyret eller infrastrukturen som det ofte kalles. Utstyret, programvaren og tjenestene har en hel rekke med innstillinger. Dette er konfigurasjoner, eller en logisk modell av hvordan infrastruktur og tjenester er satt opp.

To control the configurations, they must be identified, saved and maintained. One must also be able to keep track of different versions of the configurations. We call each part of a setup a Configuration Item (CI). A configuration file may, for example, ensure that certain users have access to a few printers in the network. Another can make sure you get a buffer on diskless clients.

En oppdatert database for konfigurasjonsstyring er avgjørende for å sikre rask og kontrollert behandling av driftsforstyrrelser, eller ønske om endringer i oppsettet av maskiner, programmer eller tjenester.

=== Planlegging ===

Det krever planlegging for å få på plass en database for konfigurasjons styring. Man må bestemme seg for områdene systemet skal brukes, målsetningen, politikken og prosessene for lagring og vedlikehold av konfigurasjonene.

 * Identifiser og velg en struktur på konfigurasjonene på de viktige delene av IT-infrastrukturen. Det gjelder også eiere av konfigurasjonen, navnelapper (attributter), avhengigheter, og relasjoner mellom konfigurasjoner.
 * Styringen med konfigurasjonene slik at bare de som er godkjente blir tatt vare på i databasen gjennom levetiden til systemet. Styring over tilgang til konfigurasjonene kan gjøres med gruppetillatelser. Dette kan gjøres gjennom prosessen for endringshåndtering (Change Mangagement).
 * Statuslogging - holder orden på tilstanden og status til de forskjellige delsystemer. Dette gjelder hele levetid til tjenesten, programvaren eller maskinvaren. Det kan være en konfigurasjon er i produksjon, er frakoblet eller avviklet.
 * Sjekking og revisjon. Hver konfigurasjon må sjekkes for å bekrefte at riktig informasjon er lagret i databasen for konfigurasjoner (CMDB). Dette følges opp med periodiske gjennomganger for å sikre at databasen hele tiden er oppdatert.

Som vi ser må man planlegge en hel del om man vil ha en god forvaltning av konfigurasjonene til IT-systemet. Hensikten ved å planlegge dette som en del av IT-driften er å sikre at systemer som går ned raskt kan komme på lufta. Har man god orden på konfigurasjonene er det lett å bytte ut en defekt maskin, og erstatte denne med en ny. Konfigurasjonene kan raskt overføres til den nye maskinen, og IT-systemet oppleves som like godt som før det gikk galt med den gamle maskinen.

=== Styring av delekonfigurasjonene ===
We have made a short check list to ensure procedures and systems for good event handling are in place.

 * The operator doing the debugging will report the status back to the ICT contact at the school and/or the user.
 * The system for logging events must be available and working (both technically and functionally) for those working with event handling in schools and at the service desk.
 * The event logging system must be used for virtually all operational events.
 * Statistics should be compiled from the event log periodically. They can be used to identify and eliminate recurring problems, which irritate users.

=== Planning and implementation ===

To set up a workable system for logging events requires something more than installing the system. Everyone in the operations department must use the system. Those reporting errors must also receive feedback by email with a ticket number. This requires significant efforts in configuring the system for event logging. In addition, one must ensure basic user training for those who receive the requests.

Large and comprehensive plans are not required to implement proper event handling. Event handling is a completely standard task for those who work at the service desk or as ICT contacts at the schools. Setting up a computer tool for logging events may require up to a few weeks for a correct configuration, and users may also report events via e-mail and by phone.

The user interface to the logging system is relatively self-explanatory, so it should not take too long to get started. Daily use of the system will get users comfortable with what should be logged. It is crucial that everyone in the operations department uses the logging system for operational messages.

=== Activities related to operational events ===

To get an idea of activities done following a reported event, we use an example.

A user contacts the service desk and reports that printing is not working. Operations logs the event immediately after the call is completed. A case is opened for the issue and is automatically given a case number.

The operator at the service desk makes a quick analysis. Has the spooler stopped again, or is it something else? Is the paper or toner missing? The operator examines the spooler and sees that the queue has filled up. She deletes the queue and tests whether the next job is printed.

This time the print queue fills up again. Operations contacts the school's ICT contact and asks them to check whether the paper tray is empty. This is recorded in the event log. The ICT contact replies that they have refilled the paper tray and that printing is back to normal. The case is closed, and this is noted in the event log.

If printing had not started again, the toner might have been missing or there might have been a printer error. If there was an error, operations would have to escalate the issue. This means that someone other than the operator or the ICT contact is needed to resolve the problem - in this example, a technician who can fix printers.

This example shows the whole workflow needed to get a printer working again. If a printer does not work even after checking that paper and toner are available, the issue needs to be escalated. The operations department must then call in an expert to fix the problem, in this case a service technician for printers.

What was wrong and what the fix was are noted in the event logging system.
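
The printing example can be sketched as a small routine that tries each local remedy in turn, logs the outcome, and escalates only when none of them helps. This is an illustrative sketch only: the function name `handle_print_incident`, the wording of the log lines and the idea of passing remedies as callables are assumptions, not part of any tool described here.

```python
from typing import Callable


def handle_print_incident(
    remedies: list[tuple[str, Callable[[], bool]]],
) -> tuple[str, list[str]]:
    """Try each local remedy in order; return the case status and the event log.

    Each remedy is a (description, action) pair. The action returns True
    when printing works again afterwards (hypothetical convention).
    """
    log: list[str] = []
    for description, fixes_problem in remedies:
        if fixes_problem():
            log.append(f"{description}: printing works again")
            return "closed", log
        log.append(f"{description}: no effect")
    # Nothing that can be done locally helped, so someone else is needed.
    log.append("escalated to printer technician")
    return "escalated", log
```

Called with the two remedies from the example (clear the queue, refill the paper tray), the routine closes the case after the second step and leaves both steps in the log.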

=== Roles ===

A variety of roles are involved when the ICT service deals with reported issues. In the example above, the school's ICT contact and the operator cooperate to solve the printing problem. Had the issue been more difficult, they would have had to call a technician. If the printer could not be fixed, a new one would have to be purchased. If the school needed to buy a new printer, the ICT managers might need to arrange payment. In many organisations, the principal has the last word.

In short, it is easy for many people to get involved when something does not work. If possible, problems should be solved on the spot, trying to avoid including unnecessary people. Escalating problems which could be solved locally quickly becomes costly. Many enquiries are easy to deal with there and then, but other requests involve more complex problems which involve more people. If additional or external help is needed to solve the problem, this must as a rule be clarified with the operations manager. The important thing is to be aware of these points when handling operating events, so as to use resources appropriately.

=== Key points ===

We have set up some key points for handling incidents. These points can be helpful in evaluating whether or not things are going well by using measurable and well-defined requirements. Such measurement points are:

 * Total number of operational incidents.
 * Average time from receiving an inquiry to when the issue is resolved, classified with codes (a well organized operation department has codes for different types of events and errors).
 * Percentage of incidents handled within agreed response time (as agreed in the service level agreement).
 * Average cost for each event
 * Percentage of incidents solved by the service desk without escalation
 * Events per client machine (workplace)
 * Number and percentage of incidents solved by the operations center without the need for a visit to the school
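
Several of these measurement points can be computed directly from the event log. The sketch below assumes a hypothetical `Incident` record with opening and resolution times and two flags. The field names are illustrative and do not come from any particular ticket system, and the number of client machines is assumed to be greater than zero.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    opened: datetime
    resolved: datetime
    escalated: bool   # True if someone outside the service desk was needed
    site_visit: bool  # True if a visit to the school was required


def key_figures(incidents: list[Incident], agreed: timedelta, clients: int) -> dict:
    """Compute a few of the key points listed above from closed incidents."""
    total = len(incidents)
    if total == 0:
        return {"total": 0}
    durations = [i.resolved - i.opened for i in incidents]
    return {
        "total": total,
        "average_resolution": sum(durations, timedelta()) / total,
        "pct_within_agreed_time": 100 * sum(d <= agreed for d in durations) / total,
        "pct_without_escalation": 100 * sum(not i.escalated for i in incidents) / total,
        "pct_without_site_visit": 100 * sum(not i.site_visit for i in incidents) / total,
        "incidents_per_client": total / clients,
    }
```

Average cost per event is left out because it needs cost data that an event log does not normally hold.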

=== Tools ===

A number of tools can make it easier to handle operational incidents.

 * Automatic logging
 * Automatic routing of events to the right persons
 * Automatic retrieving of data from the database for configuration management
 * Phone and email are used in conjunction with tools for registering requests and incidents.

== Problem Management ==

Problem management is an "investigative" process. Known errors are most often handled directly by the service desk. This is the most common form of event handling. Unknown errors require a closer investigation of what is wrong. This kind of troubleshooting requires both common sense and instinct. Good operations staff use instinct to go straight to the problem, find the solution and restore service as quickly as possible so that everything works normally.

'''Problem management consists of:'''

 * Problem control
 * Error control
 * Proactive control to prevent problems
 * Identifying error patterns, using information from, for example, event management

'''Problem control'''

 * Identify problems
 * Classify problems
 * Examine/research problems

'''Error control'''

 * Identify and register known errors
 * Find temporary solutions if possible
 * Contact those responsible for Change Management to have the error removed permanently

'''Proactive control'''

 * Identify and solve problems and errors before the incident is reported by users.
 * Use logs and information from event handling to see where problems may arise

=== Procedures for problem management ===

The Skolelinux/Debian Edu manual is a comprehensive collection of solutions for solving problems and configuring systems. Everything is on the Debian wiki pages. Solutions are maintained with the help of staff in schools, municipal ICT services, professionals and volunteers. See the English pages: https://wiki.debian.org/DebianEdu/Documentation/Manuals The pages are being translated to Norwegian bokmål, and we are working on linking to the bokmål pages too.

Wiki technology has proven to be a great success for maintaining catalogued information on the internet. It is easy to contribute, and all changes are logged. It is also possible to import !OpenOffice.org documents, and export documents as PDF.

== Configuration Management ==

The resources spent on IT systems in schools must be handled in a financially prudent manner. This requires control over the services used and over the equipment, or infrastructure as it is often called. The equipment, software and services have a whole range of settings. These are configurations, or a logical model of how infrastructure and services are set up.

To manage configurations, they must be identified, saved and maintained. One must also be able to keep track of different versions of the configurations. We call each part of a setup a Configuration Item (CI). A configuration file may, for example, ensure that certain users have access to a few printers in the network. Another can make sure you get a buffer on diskless clients.

An up-to-date configuration management database is essential to ensure rapid and controlled handling of operational disruptions, or of requests for changes in the setup of machines, programs or services.

=== Planning ===

It takes planning to set up a database for configuration management. One must decide which areas the system will cover, and define the objectives, policies and processes for storing and maintaining configurations.

 * Identify and select a structure for the configurations of the important parts of the ICT infrastructure. This includes configuration owners, name tags (attributes), dependencies, and relations between configurations.
 * Control the configurations so that only approved ones are kept in the database throughout the lifetime of the system. Access to the configurations can be controlled with group permissions. This can be handled through the Change Management process.
 * Status logging keeps track of the condition and status of the various subsystems. This applies throughout the lifetime of the service, software or hardware. A configuration may be in production, disconnected or discontinued.
 * Checking and revision. Each configuration must be checked to confirm that the correct information is stored in the configuration database (CMDB). This is followed up with periodic reviews to ensure that the database is kept up to date.

As we see, a good deal of planning is needed for good configuration management of the IT system. The purpose of planning this as part of IT operations is to ensure that systems that go down can be brought back up quickly. With good configuration management, it is easy to replace a defective machine with a new one. The configurations can be quickly transferred to the new machine, and the IT system works just as well as before the old machine failed.
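
A minimal sketch of what a record in such a database might hold, assuming the attributes named in the planning list (owner, status, dependencies) plus a date for the last check. The class and field names are hypothetical and not taken from any specific CMDB product.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from enum import Enum


class Status(Enum):
    IN_PRODUCTION = "in production"
    DISCONNECTED = "disconnected"
    DISCONTINUED = "discontinued"


@dataclass
class ConfigurationItem:
    name: str
    owner: str
    status: Status
    last_verified: date  # when the stored data was last checked against reality
    depends_on: list[str] = field(default_factory=list)


def due_for_review(items, today: date, max_age: timedelta) -> list[str]:
    """Return the names of CIs still in use whose data has not been verified recently."""
    return [
        ci.name
        for ci in items
        if ci.status is not Status.DISCONTINUED
        and today - ci.last_verified > max_age
    ]
```

A periodic review would then walk the list returned by `due_for_review`, check each item and update its `last_verified` date.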

=== Management of Configuration Items (CI) ===
Line 348: Line 274:
To make this concrete, consider the configuration of the print server. You want to add a new printer to the network and add it to the CUPS printing system. You then change the configuration through a web application or via the setup in KDE. The CUPS configuration file changes, and you must restart the print server. This can be done in the KDE tool or through the web application. The modified configuration file is copied to a directory where it can be handled by a version control system.
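
The last step, copying the changed file into a directory kept under version control, could be sketched as follows. The helper name `snapshot_config` and the repository path in the comment are assumptions made for illustration. Committing the copy is left to whichever version control system is in use.

```python
import hashlib
import shutil
from pathlib import Path


def snapshot_config(source: Path, repo_dir: Path) -> bool:
    """Copy source into repo_dir if its content differs from the stored copy.

    Returns True when a new copy was written, meaning there is a change
    that should be committed to the version control system.
    """
    repo_dir.mkdir(parents=True, exist_ok=True)
    target = repo_dir / source.name
    new_digest = hashlib.sha256(source.read_bytes()).hexdigest()
    if target.exists():
        old_digest = hashlib.sha256(target.read_bytes()).hexdigest()
        if old_digest == new_digest:
            return False  # nothing changed since the last snapshot
    shutil.copy2(source, target)
    return True


# Hypothetical usage after adding a printer (the repository path is an example):
# snapshot_config(Path("/etc/cups/printers.conf"), Path("/srv/config-repo/cups"))
```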
Line 352: Line 278:
Man bør være varsom med å endre konfigurasjonene uten en skikkelig plan. Det er lett å glemme hva man har gjort på en tjenermaskin eller en PC. Derfor er det viktig med dokumentasjon av endringene som er gjort i en endringslogg.

=== Planlegging og installasjon ===

Konfigurasjonen av datanettet henger sammen med arkitekturen. Mye av planleggingen er gjort med Skolelinux. Dette skyldes at det fort tar både 3 og 4 uker å sette opp tjenermaskiner med tilsvarende tjenestenivå med Windows server eller RedHat og andre GNU/Linux-distribusjoner. Med Skolelinux tar dette 1-2 timer. Skal man ha fast IP-adresse på nettverket bruker en fagperson ½ time ekstra på dette. Det skyldes at nett-tjenestene er satt opp med gjenbrukbare navn.

Det som da må planlegges er hvilke ekstra brukerprogram som skal opp, og hvilke delsystemer som skal samvirke med Skolelinux. Det kan f.eks. være at skolen har elektronisk tusjtavle (eng. Whiteboard).
One should be cautious in changing configurations without a proper plan. It is easy to forget what you have done on a server or a PC. Therefore it is important to document the changes made in a change log.

=== Planning and installation ===

The configuration of the computer network is tied to the architecture. With Debian Edu, much of the planning is already done. Setting up servers with a corresponding service level using Windows Server, !RedHat or other GNU/Linux distributions can easily take 3 to 4 weeks. With Debian Edu this takes 1-2 hours. If you want a fixed IP address on the network, a professional needs an extra ½ hour. This is because the network services are set up with reusable names.

What must then be planned is which additional user programs to install, and which subsystems should interact with Debian Edu. The school may, for example, have an electronic whiteboard.
Line 362: Line 288:
Vi har laget en liste med aktiviteter og løsninger som må på plass skal man ha god styring på konfigurasjoner.

 * Etabler en versjonshåndtert område for lagring av konfigurasjoner til alle tjenermaskiner og utvalgte arbeidsstasjoner og bærbare. Man kan bruke versjonsystemet subversion til dette. Husk å ta daglig backup av område, og sørg for å lagre alle endringer i konfigurasjonene.
 * Bruk et elektronisk system for å ta vare på oppskrifter som forklarer konfigurasjonene til forskjellig type maskiner, nettverket og tjenester. Slike oppskrifter bidrar til at andre som hjelper eller overtar driften kan lese seg opp på hva som er gjort. En wiki kan være passende til dette.
 * Bruk en bestemt utgave av operativsystem og programvare på alle maskiner. Dette for å unngå å vedlikeholde mange forskjellige utgaver av programvaren. Sørg for at programvaren er godt testet. Derfor kan det være lurt å vente i 6-12 måneder før man tar i bruk nyeste utgave av et program.

=== Relasjoner til andre prosesser ===

Styring av konfigurasjoner henger nøye sammen med håndtering av problemer og om systemene er tilgjengelige. Opplever man stadig vekk at utskriften stopper, så kan det hende det hende en endring av konfigurasjonen løser problemet. Det kan f.eks. være å få på plass en rutine for sletting av utskriftskøen og starte utskriftstjenesten på ny.

Målsetningen med endringene man gjør i konfigurasjonene er som regel å øke tilgjengeligheten til tjenester eller programmer. Det kan også være for å begrense tilgangen til enkelte programmer eller tjenester til bestemte tidspunkt. For å få dette til må man endre oppsettet til tjenesten. I tillegg kan det koste penger ut over det som er avtalt om tjenestenivå, eller kapasiteten til systemet.

Eksemplene viser at håndtering av konfigurasjoner griper inn i en rekke andre områder. Derfor er det mye å tjene på å få på plass gode arbeidsrutiner for håndtering av endringene som gjøres i konfigurasjonene. Også automatisering er lurt om man ønsker økt stabilitet, eller tilgang til bestemte tjenester i bestemte perioder.
We have made a list of activities and solutions that are important in good configuration management.

 * Establish a version-controlled area for storing the configurations of all servers and of selected workstations and laptops. Git and SVN are often used for this. Remember to take a daily backup of the area, and make sure that all configuration changes are saved.
 * Use an electronic system to keep recipes that explain the configuration of the different types of machines, the network and the services. Such recipes allow others who help with or take over operations to read up on what has been done. A wiki can be suitable for this.
 * Use one specific version of the operating system and software on all machines. This avoids having to maintain many different versions of the software. Ensure that the software is well tested. It may therefore be wise to wait 6-12 months before adopting the latest version of a program.

=== Relations to other processes ===

Configuration management is closely connected with problem handling and with system availability. If printing stops too often, a configuration change may solve the problem. This could, for example, be a routine that clears the print queue and restarts the print service.

The aim of configuration changes is usually to increase the availability of services or programs. It may also be to restrict access to certain programs or services to specific times. To achieve this, one must reconfigure the service. In addition, it may cost money beyond what was agreed for the service level or the capacity of the system.

The examples show that configuration management touches a number of other areas. There is therefore much to gain from putting good working practices in place for handling configuration changes. Automation is also advisable if you want greater stability, or access to certain services in specific periods.
Line 378: Line 304:
Som nevnt under huskelisten kan man bruke:

 * Lagring av konfigurasjonsfiler i et system for versjonskontroll, f.eks. subversion
 * Wiki for lagring av dokumentasjon av oppsett og veivisere
 * Bruk av felles katalog for driftsdokumentasjon på Internett vedlikeholdt av de som sentraldrifter Skolelinux på mange skoler
As mentioned under the check list, one may use:

 * A version control system, for example subversion, for storing the configuration files.
 * A wiki for storing setup documentation and how-to guides.
 * A shared online catalogue of operational documentation, maintained by those who run Skolelinux/Debian Edu centrally for many schools.
Line 388: Line 314:
Det er helt avhengig av skikkelige prosesser ved endringsmeldinger. Dette gjelder uavhengig om endringene er små eller store. Derfor er det viktig å ha på plass riktige personer når man gjør endringer slik at det både er gitt opplæring og det er folk til å svare på spørsmål. Dette blir spesielt viktig når man tar i bruk nye utgivelser av programvare og tjenester. Det har ingen ting med om man bruker fri eller produsenteid programvare.

Endringsledelsen skal sørge for at alle endringer gjøres på en standardisert og riktig måte. Det er viktig å få til beslutning om endring på riktig nivå i organisasjonen, Standard endringer kan ofte være forhåndsgodkjente når de er gjort et par ganger. Men større endringer vil ofte involvere et høyre beslutningsnivå mellom skoleledelsen og driftsoperatør.

Grunnen til at ledelsen skal være med er at endringer i systemet er at en oppgradering ofte vil kreve opplæring av brukerne. Det kan være oppgradering til ny nettleser eller ny utgave av kontorprogrammet. Dette kan fort føre med seg en halv dags opplæring i hva som er nytt i et program. Slike endringer må derfor avklares med ledelsen. Endringene må også gjøres uten at det andre deler av systemet slutter å fungere.

De med ansvar for å godkjenne endringer mottar en mottar en såkalt endringsmelding eller RFC (Request For Change) som er den engelske forkortelsen. Når man har en endringsmelding kan man vurdere om endringen skal utføres. Mange ganger må man avklare med ledelsen om eventuelle endringer skal gjøres, og eventuelt når det skal skje.

Ved endringer må man også samarbeide med skolens IKT-ansvarlig. Man må sørge for at endringene skjer når det passer med skolenes planer. Å gjennomføre betydelige endringer uten endringsledelse kan føre til mye misnøye og ekstra henvendelser til servicekontoret. Da får man betydelig ekstraarbeid uten at dette er planlagt. I tillegg kan det føre til en endring som ville fort må rulles tilbake. Man får fort dobbelt så mye arbeide uten å havne noe annet sted enn tilbake til start. Hadde man sørget for de nødvendige godkjenninger ville endringen kunne gjøres på en planlagt og grei måte.

Endringsledelse gjøres for å unngå mer ekstra arbeide enn hva som er nødvendig. Å gjøre endringer krever selvsagt mer arbeide, men man vil få mindre ekstra arbeide om endringene planlegges. Man unngår også at man må rulle tilbake endringer fordi det oppstår problemer der brukerne ikke er forberedt på betydelig omlegging.

Når man f.eks. oppdaterer hele systemet til ny versjon må man passe på at alle er orientert. Man må undersøke om de som berøres av endringen trenger opplæring. De rette fagpersonene må forberede det hele slik at det ikke oppstår overraskelser.

Men må unngå at alt ansvar havner hos den som står for styring av versjonene av programvaren. Det er den som har ansvaret for håndtering av utgivelser (release manager). Utgivelseshåndteringen er en prosess som fortrinnsvis skal arbeide med endringer som inneholder mange mindre endringer. Dette skjer som regel ved utrulling av nye systemer og tjenester, eller ved oppgradering av hele systemet til ny versjon.

=== Aktiviteter ===

 * Se over endringsmeldingen (RFC) og sjekk at den også har fått et unikt nummer.
 * Prioriter og kategoriser endringer
 * Fjern endringer som ikke er mulige. Dette kan gjøres ved å merke disse som ikke mulig.
 * Gi tilbakemelding til den som gav endringsmeldingen
 * Sørg for at man har en rådgivingsgruppe (eng. Change Advisory Board) der endringen blir tatt opp, diskutert og vurdert. Rådgivningsgruppa kan være utvalgte IT-kontakter og driftspersonell med lang erfaring.
 * Koordiner endringene med den som håndterer forskjellige versjoner av programmer og tjenester (eng. Release Management)
 * Se over og avslutt endringsmeldingen (RFC)
 * Husk å lagre endrede konfigurasjoner i lageret for oppsettfiler
 * Rapporter

Selv hva som kan se ut som en liten ubetydelig endringsmelding kan få omfattende konsekvenser om endringen gjøres. Vi har eksempler på skoler som har et stabilt Skolelinux-nett der alle programmene virker. Så installeres en testutgave av et populært program som kræsjer hele tiden. Skolelinux får skylda.

Et eksempel er skoler som har installert testversjonen av nyeste OpenOffice.org før programmet var endelig ferdig. Flere syntes at det kunne være gøy og prøve ut. Problemet er at testutgaver som regel er utgitt for å finne feil og ustabilitet i programmer. De er ikke ment for bruk i produksjon.

Hovedregel er at man ikke installerer testutgaver av programvare i produksjon. De fleste driftsoperatører anbefaler å bruke nest siste versjon av et program beregnet på produksjon. Etter 6-12 måneder er som regel de verste feilene plukket av en ny hovedutgave av et program.
Handling change requests depends entirely on proper processes. This applies regardless of whether the changes are small or big. It is therefore important to have the right people in place when making changes, both to give training and to answer questions. This becomes especially important when adopting new releases of software and services. It has nothing to do with whether one uses free or proprietary software.

Change Management should ensure that all changes are made in a standardized and correct manner. It is important that the decision to make a change is taken at the appropriate level in the organisation. Standard changes can often be pre-approved once they have been done a couple of times. Major changes, however, will often involve a higher decision level between school management and the operator.

The reason management should be involved is that an upgrade will often require user training. It may be an upgrade to a new browser or a new version of the office suite. This can quickly lead to half a day of training in what is new in a program. Such changes must therefore be agreed with management. The changes must also be made without causing other parts of the system to stop working.

Those responsible for approving changes receive a so-called change request, or RFC (Request For Change). With an RFC in hand you can assess whether the change should be carried out. Often you have to clarify with management whether a change should be made and, if so, when.

When making changes, one must also cooperate with the school's ICT contact. One must ensure that changes happen at a time that fits the school's plans. Implementing significant changes without Change Management can lead to much dissatisfaction and additional inquiries to the service desk. This causes significant unplanned extra work. In addition, it may result in a change that soon has to be rolled back. You quickly end up with twice as much work and nothing to show for it but a return to the starting point. Had the necessary approvals been obtained, the change could have been made in a planned and orderly manner.

Change Management is done to avoid more extra work than necessary. Making changes obviously requires work, but there will be less extra work if the changes are planned. One also avoids having to roll back changes because problems arise when users are unprepared for a substantial change.

When you, for example, update the entire system to a new version, make sure that everyone is informed. One must find out whether those affected by the change need training. The right professionals must prepare everything so that there are no surprises.

However, not all responsibility should land on the person who manages software versions, the release manager. Release handling is a process that should primarily deal with changes that contain many smaller changes. This usually happens when rolling out new systems and services, or when upgrading the entire system to a new version.

=== Activities ===

 * Review the change request (RFC) and check that it has been given a unique number.
 * Prioritize and categorize the changes
 * Remove changes that are not possible. This can be done by marking them as not possible.
 * Give feedback to the person who submitted the change request
 * Make sure you have a Change Advisory Board where the change is raised, discussed and evaluated. The advisory board can consist of selected ICT contacts and operations personnel with long experience.
 * Coordinate changes with Release Management, which handles the different versions of applications and services.
 * Review and close the change request (RFC)
 * Remember to save modified configurations in the repository for configuration files.
 * Report
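
The activities above amount to a small lifecycle for each change request. The sketch below models it as a record with a unique number and a fixed set of allowed state transitions. The state names (`registered`, `assessed`, and so on) are assumptions chosen for illustration, not terms prescribed by this document.

```python
from dataclasses import dataclass, field
from itertools import count

_next_number = count(1)  # source of unique RFC numbers

# Which states may follow which (hypothetical naming).
ALLOWED = {
    "registered": {"assessed", "rejected"},
    "assessed": {"approved", "rejected"},
    "approved": {"implemented"},
    "implemented": {"closed"},
    "rejected": {"closed"},
    "closed": set(),
}


@dataclass
class ChangeRequest:
    summary: str
    requester: str
    number: int = field(default_factory=lambda: next(_next_number))
    state: str = "registered"
    history: list[str] = field(default_factory=list)

    def move_to(self, new_state: str) -> None:
        """Advance the RFC, refusing transitions that skip a step."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"RFC {self.number}: cannot go from {self.state} to {new_state}"
            )
        self.history.append(f"{self.state} -> {new_state}")
        self.state = new_state
```

The `history` list doubles as raw material for reporting and for the feedback given to the requester.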

Even what looks like a small, insignificant change request can have major consequences if the change is implemented. We have examples of schools with a stable Debian Edu network where all the programs work. Then a test version of a popular program is installed, it crashes constantly, and Debian Edu gets the blame.

One example is schools that installed a test version of the latest !OpenOffice.org before the program was finished. Several people thought it would be fun to try out. The problem is that test versions are usually released in order to find errors and instability in programs. They are not intended for production use.

The general rule is that you do not install test versions of software in production. Most operators recommend using the next-to-latest version of a program intended for production. After 6-12 months, the worst errors have usually been removed from a new major version of a program.
Line 424: Line 350:
== Release Management ==

Release handling covers the administration and planning activities needed to prepare for requested changes. Changes can be small or large, and large changes may consist of many smaller sub-changes. Release handling takes place before the actual work of installing software and hardware in production begins.

First, new releases are planned and tested. Then everything is rolled out into production. Rollout is part of infrastructure management. The procedure is to carry out what has been planned, tested, and is ready in the configuration management systems. When everything is planned and tested and the configurations are stored, the solution is rolled out into operation.

As a rule, many service providers and suppliers are involved. This applies to the procurement of machines, the software used, and the recommended configurations. Good resource planning is fundamental to packaging and distributing a new release in a way that works well for users. If you are careless here, you may end up with equipment that does not work, or that is left unused because the installation is incomplete.

Release management takes a holistic approach to changes in a service, and must ensure that all parts of a release are seen in context. This applies to both technical and non-technical matters.

=== Basics ===

As you can see, release handling is fundamental to making computers, software and the network work as planned. Proper release handling is done to prevent operational disruptions. With new releases or changes, operations are expected to continue as normal, without interruption or reduced quality.

Handling changes or new releases can be compared to building a new road. Cars must still get through even when a new road is built on top of the old one. Good signposting must be in place. You must also have the resources needed to reroute the road. If you lack the resources to make the changes, it is better to leave things alone.

For some, proper release handling can be boring. You do not get to use the very latest every time something new comes out. But usually no extra time is set aside in the operations department to handle a flood of complaints when brand new software fails. Linux expert David Elboth states that high uptime requires established technology. In LINUXmagasinet (1/2004) he writes:

 * The higher the requirements, the stricter the demands on the individual components. High uptime requirements also mean that the choices you are left with are old technology. It is experience gathered over time that can say something about downtime. We have all noticed how far behind Red Hat and SuSE are with their server products.

Keeping complaints few and the environment stable and reliable requires solid release handling. The alternative is a flood of complaints from dissatisfied users when insufficiently tested cutting-edge software is installed. People with hobbyist skills tend to underestimate the consequences of a software upgrade. That something works fine on your home computer does not mean it will work in a large network with 500 client computers and 3200 users.

=== Sentralt programarkiv (DSL) ===

Programarkivet i driftssammenheng er en samling av originalutgaven av den programversjonen av programvaren som er i produksjon. Bruker man Skolelinux 2.0 er det dette som er programarkivet. I dataverdenen brukes ordet programarkiv i flere sammenheng, spesielt når man programmerer. Når det gjelder drift snakker vi den originale sammensatte programvare av en bestemt versjon som er utgangspunktet for installasjonen.

Bruker man fri programvare kan programarkivet være Skolelinux 2.0 pluss de ekstra programmene man har lagt inn i tillegg fra forskjellige kilder. Det kan være bestemte versjoner av Macromedia Flash, Java og decodere som gjør det mulig å kjøre nasjonale prøver i nettleseren, eller se sendinger fra NRK.

Har man planlagt og oppgradere til neste versjon av Skolelinux når den har kommet, vil det være den nye versjonen som blir hoved programarkiv. Også her vil alle ekstra programmer ut over ny Skolelinux være en del av arkivet.

Oppsettfiler som er justert eller laget lokalt av driftsavdelingen er følger ikke med som en del av hovedarkivet for programmer. Konfigurasjoner lagres i en egen versjonshåndtert katalog eller database.

=== Database for konfigurasjoner og maskinvare ===

Som nevnt under kapitlet med konfigurasjonsstyring må man opprette en database eller en versjonshåndtert katalog for å ta vare på oppsettfiler. Man må også ha oversikt over alle datamaskiner, hva slags maskiner det er snakk om, ytelse, og unike standardadresser på nettverkskortene (MAC adresser).

Det er mange grunner til å ha oversikt over utstyret. En av hovedgrunnene er å ha oversikt over hvor mange maskiner som er i drift, antall maskiner som ikke er i bruk, og antall maskiner på reparasjon. En annen grunn går på planlegging ved oppgraderinger. Det går både på hvor mye

=== Bygg-håndtering ===

I skolen installeres en rekke program i tillegg til nettleser og kontorprogram. Det trengs pedagogiske program for læring, tilleggsprogram i nettleseren, og det trengs program for multimedia. Systemene har også nettverksoppsett og endrede innstillinger i bestemte programmer. Har man mange tjenermaskiner og kanskje tusenvis av klienter ser man raskt at det er behov for effektive verktøy for utrulling. Slike verktøy er standard i Skolelinux.

Bygg-håndtering handler om å få installert de ønskede programpakkene, tjenestene og riktig innstillinger både av enkelte program og datanettverket. Mange har hørt om å bygge såkalte «images». Man installerer operativsystem og alle de programmene man trenger. Stiller inn nettverket. Deretter bruker man et image-program for å lage en kopi av det som er installert på harddisken. Dette kopieres så ut til de andre datamaskinene.

Det er slett ikke nødvendig å bygge såkalte «images» eller diskbilder man kan kalle det på norsk. Skolelinux bygger på Debian som har et utmerket pakkesystem. Man trenger på ingen måte å kompilere programmer da dette er ferdig satt sammen, og kan installeres rett fra Internett. Det man må ha orden på er ønskede endringer i standardoppsettet til Skolelinux eller hoved programarkivet som er i bruk. Deretter lager man et eller flere skript som kan kjøre på de forskjellige maskinene for å få alt installert og satt opp.

For de fleste situasjoner er skripting en enkel måte å «bygge» og rulle ut programmer og oppsett. Men det er situasjoner der bygging av diskbilder kan være løsningen. F.eks. ved installasjon på mange bærbare datamaskiner.

Som vi ser handler håndtering av bygg-prosessen om å tilrettelegge for utrulling på mange datamaskiner. Helt unntaksvis handler det om å bygge en skreddersydd Debian-pakke. Men i de aller fleste situasjoner er alt pakket ferdig. Da må man få på plass et skript for som installerer ekstra programmer og bestemte innstillinger. Man kan også lage diskbilder om man har mange like maskiner, som f.eks. bærbar PC til alle elevene.
== Release Management ==

Release management comprises the management and planning activities that prepare for desired changes. Changes can be small or large, and a large change may consist of many smaller ones. Release management takes place before the actual work of installing software and hardware in production begins.

First, new releases are planned and tested. Then everything is rolled out into production. Deployment is part of infrastructure management: the procedure is to implement what has been planned, tested and made ready in the configuration management systems. Once everything is planned and tested and the configurations are stored, the solution is rolled out into production.

Usually, many service providers and suppliers are involved. This applies to the acquisition of machines, to the software used, and to the recommended configurations. Good resource planning is essential for packaging and distributing a new release in a way that works well for users. Cutting corners in this area can lead to equipment that doesn't work, or that goes unused because of deficiencies in the installation.

Release management takes a comprehensive approach to changes in a service, and ensures that all parts of a release are seen in context. This applies to both technical and non-technical factors.

=== Basics ===

As you can see, release management is crucial for computers, software and the network to work as planned. Proper handling of releases prevents disruptions. When new releases or changes are introduced, operations are expected to continue as normal, without interruption or reduction in quality.

Implementing changes or new releases can be compared to building a new road. Cars must still get past even if you build a new road on top of the old. Good signs must be in place. One must also have the necessary resources to rebuild the road. If you lack the resources to make changes, it's better to let it be.

Some might find proper release management boring, since one doesn't get to deploy the latest version every time something new is released. But the operations department rarely has spare resources to handle a flood of complaints should an upgrade fail. High uptime requires established technology, as Linux expert David Elboth wrote in LINUXmagasinet (1/2004):

 * The higher the demands on the system, the more stringent the requirements on the individual components. High uptime requirements also mean that the choices you are left with are old technology. Only empirical data gathered over time can say anything about downtime. We have all noticed how far behind Red Hat and !SuSE are with their server products.

Keeping complaints few, with a stable and reliable environment, requires solid release management. The alternative is a flood of complaints from dissatisfied users, caused by installing insufficiently tested cutting-edge software. Amateurs tend to underestimate the consequences of software upgrades. That something works fine on your home computer does not mean it will work in a large network with 500 client computers and 3200 users.

=== Definitive Software Library (DSL) ===

In an operational context, a software archive is a collection of the original copies of the software in production. If you use Skolelinux 2.0, that release is the software archive. The phrase is used differently in some other contexts, especially among programmers. In operations, it means the original software collection of a particular version that is the starting point for the installation.

If you use free software, the software archive may be Skolelinux 2.0 plus the extra programs you have added from various sources. These may include specific versions of Macromedia Flash, Java and decoders that make it possible to run national tests in the browser, or to watch broadcasts from a national TV station.

If you plan to upgrade to the next version of Debian Edu when released, this new version shall be the main program archive. The new archive shall also include appropriate versions of all additional applications beyond Debian Edu.

Set-up files customized or created locally by the operations department are not included in the main program archive. Configurations are saved separately in a version-control system or database.
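
To make the idea of a definitive software library more concrete, the following minimal Python sketch treats the archive as a mapping from package names to approved versions and reports where a machine deviates from it. The package names, version strings and function names are invented for illustration; they are not taken from Debian Edu.

```python
def find_deviations(archive, installed):
    """Return {package: (approved, found)} for packages that differ from the archive.

    `found` is None when an approved package is missing on the machine.
    """
    deviations = {}
    for name, approved in archive.items():
        found = installed.get(name)
        if found != approved:
            deviations[name] = (approved, found)
    return deviations


def find_unapproved(archive, installed):
    """Return the sorted names of installed packages that are not in the archive."""
    return sorted(set(installed) - set(archive))


# Hypothetical archive contents and machine state (illustrative values only).
archive = {"office-suite": "2.0", "browser": "1.5", "flash-plugin": "7.0"}
installed = {"office-suite": "2.0", "browser": "1.6", "beta-game": "0.1"}

print(find_deviations(archive, installed))
print(find_unapproved(archive, installed))
```

A report like this is only as good as the archive listing behind it, which is one more reason to keep that listing maintained.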

=== Database for configurations and hardware ===

As mentioned in the chapter on configuration management, you must create a database or a version-controlled directory to store set-up files. You should also keep track of all computers: what kinds of machines they are, their performance, and the unique hardware addresses of their network cards (MAC addresses).

There are many reasons to have an overview of the equipment. One of the main reasons is to keep track of how many machines are in operation, how many are not in use and how many are being repaired. Another reason is planning for upgrades.
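
The hardware overview described above can start out very simply. The following Python sketch shows one possible shape for it, assuming one record per machine and three status values matching the text. The field names, host names and MAC addresses are made up for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """One row of the hardware overview (all field names are illustrative)."""
    hostname: str
    mac_address: str  # unique hardware address of the network card
    kind: str         # e.g. "thin client", "workstation", "server"
    status: str       # "in operation", "not in use" or "being repaired"


def count_by_status(machines):
    """Return how many machines there are in each status."""
    return Counter(machine.status for machine in machines)


# Hypothetical inventory with invented host names and addresses.
inventory = [
    Machine("ws01", "00:16:3e:00:00:01", "workstation", "in operation"),
    Machine("ws02", "00:16:3e:00:00:02", "workstation", "being repaired"),
    Machine("tc01", "00:16:3e:00:00:03", "thin client", "in operation"),
    Machine("tc02", "00:16:3e:00:00:04", "thin client", "not in use"),
]

summary = count_by_status(inventory)
print(summary["in operation"], summary["not in use"], summary["being repaired"])
```

The same records could just as well live in a spreadsheet or a database table; the point is that the counts needed for planning come straight out of the data.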

=== Build management ===

In schools, a variety of applications are installed in addition to the browser and office suite. Educational programs, browser plug-ins and multimedia programs are needed. The systems also have their own network set-up and changed settings in specific programs. When you have many servers and perhaps thousands of clients, the need for effective deployment tools soon makes itself felt. Such tools are standard in Debian Edu.

Build management is about ensuring that the required software packages and services are installed, with the proper settings both for individual programs and for the network. Many people have heard about so-called "images". One installs the operating system with all the programs needed and configures the network. Then one uses an imaging program to make a copy of the hard disk. This "disk image" can then be copied to other computers.

It is not necessary to build such disk images. Debian Edu is based on Debian which has an excellent package management system. There is no need to compile applications, as ready-made packages can be installed directly from the Internet. It is enough to work out what changes you want to the default set-up of Debian Edu or the main program archive in use. Then you make one or more scripts to run on each machine that get everything installed and set up.

For most situations, scripting is an easy way to "build" and roll out programs and configurations. But there are situations where building disk images may be the solution, e.g. for installation on many laptops.

As we have seen, build management is about facilitating deployment on many computers. In exceptional cases, this may involve building a tailor-made Debian package. But in most situations, everything is already packaged. Then you need a script that installs the additional programs and applies the required settings. You can also create disk images if you have many similar machines, such as laptops for all students.
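
The kind of installation script described above can be sketched as follows. This is a minimal Python example under stated assumptions: the package names are only examples, the function name is hypothetical, and the commands are printed rather than executed so that the sketch is safe to run anywhere.

```python
import shlex


def build_install_commands(packages, dry_run=True):
    """Return the apt-get commands that would install the given packages.

    With dry_run=True, apt-get is asked to simulate the installation only.
    """
    if not packages:
        return []
    install = ["apt-get", "install", "--yes"]
    if dry_run:
        install.append("--simulate")
    # Sort and de-duplicate so the same list always yields the same command.
    return [["apt-get", "update"], install + sorted(set(packages))]


# Example package names; replace with the extra programs your schools need.
extra_packages = ["tuxpaint", "gcompris"]

for command in build_install_commands(extra_packages):
    print(shlex.join(command))
```

On a real machine, each command could be passed to `subprocess.run(command, check=True)` with root privileges once the simulated run looks right.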
Line 476: Line 402:
Det er helt avgjørende å teste nye programmer, konfigurasjoner, og nye tjenester før de settes i produksjon. Flere skoler har erfart ustabilitet fordi de har installerer programvare uten å gjøre de nødvendige justeringene. Derfor er det helt avgjørende å teste endringer i konfigurasjoner eller ny versjon av programvaren før endringen gjøres på alle maskinene.

Testing skjer gjerne i tre steg.

 * First, install the changes on a test network. This is technical testing to guarantee that everything connects and works in a system without users. Retain all changes made to configuration files.
 * Når man er sikker på at alt virker på den tekniske siden prøveinstalleres løsningen på en skole. Det er svært viktig å avtale testingen med skolens IT-kontakt. Brukerne må også få full orientering om at det vil bli endringer fordi man utfører testing. Ta vare på aktuelle justeringer i oppsettfiler som er gjort underveis ut fra de driftsmeldingene som har kommet.
 * Når man er sikker på at alt virker kan man rulle ut løsningen til alle skolene. Det gjøres enklest med å lage et skript som forenkler oppgradering av programpakker, tjenester og konfigurasjoner.

=== Reserveløsning ===

Mye kan gå galt under en ny installasjon eller oppgradering. Derfor må man ha klar en reserveløsning. Det betyr at man på kort tid kan ta i bruk systemet slik det var før oppgraderingen. På fagspråket heter dette tilbakerulling.

Når man skal rulle tilbake er det helt avgjørende å ha klar forrige versjon av arkivet for programvare og oppsettfiler. Det betyr at man kan installere f.eks. Skolelinux 1.0 på under en time, og legge på plass aktuelle konfigurasjonsfiler.

Men tilbakerulling tar tid. Derfor kan det være greit å ha en tjenermaskin klar med forrige utgave av programvaren, de riktige konfigurasjonene, og hjemmekatalogene til brukerne. Denne tjeneren kan raskt erstatte maskinene som ble oppgradert, men ikke virket etter planen. Ved å ha tjenermaskin(er) i reserve kan man sørge for høy tilgjengelighet selv om noe skulle gå galt.

=== Fordeler og mulige problemer ===

Fordelen med å ha arkiv over programvaren som er i produksjon kan ikke undervurderes. Mange satser på å ha programvaren på sine respektive CD-er og enkelte DVD-er. Dette gir lite effektiv distribusjon. For å spare tid og bryderi er all programvaren i Skolelinux tilgjengelig på Internett.

Driftsavdelingen kan lage kopi av Skolelinux-arkivet på en sentral tjenermaskin. Herfra kan all programvaren raskt og greit installeres på de andre maskinene. Fordelen med dette er at IT-tjenesten hele tiden har oversikt over hvilke versjoner av programvaren som de har gjort tilgjengelig for skolene. Man hindrer også installasjon av programvare som ikke har vært vurdert av styringsgruppa for endringer.

Det kan oppstå betydelig problemer om man ikke vedlikeholder programarkivet og konfigurasjoner. Det kan også være at man gjør feil med en konfigurasjonsfil eller programpakke. Da rulles dette ut til alle maskinene. I tillegg kan enkelte skoler installere lite testet programvare eller beta-program som de setter i produksjon. Så man må ha gode prosesser og ha noen å holde ansvarlige for vedlikehold av programarkivet og konfigurasjonene.

It may seem that a lot of extra work is needed to install and maintain the services and software already in use. However, if you opt out of the tools that manage upgrades, you give yourself a lot of extra work. The ICT service must spend a lot of time on manual installation on each machine. The danger of making mistakes increases. When things do not work you get disgruntled users, and much time is spent on debugging.
It is essential to test new applications, configurations, and new services before they are put into production. Several schools have experienced instability when they have installed software without making the necessary adjustments. Therefore it is crucial to test changes in configurations or new versions of the software before the change is made on all machines.

Testing generally takes place in three steps.

 * First, install the changes on a test network. This is technical testing to check that everything fits together in a system with few users. Take care to record all changes to configuration files.
 * When you are sure that everything works on the technical side, try installing the solution at one school. It is very important to agree on the testing with the school's ICT contact. Users must also be fully briefed that there will be changes because testing is under way. Take care to preserve current adjustments in the set-up files, which may have been made in the course of normal maintenance.
 * When you are sure everything works, you can roll out the solution to all schools. The easiest way is to create a script that simplifies upgrading of software packages, services and configurations.
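
The three steps above can be sketched as a small staged roll-out loop. The following Python example is only an illustration: the stage names come from the list, while the `deploy` and `verify` callbacks are hypothetical placeholders for whatever installation and checking procedures the operations department actually uses.

```python
STAGES = ["test network", "pilot school", "all schools"]


def run_staged_rollout(deploy, verify, stages=STAGES):
    """Deploy stage by stage and stop at the first stage that fails verification.

    Returns the list of stages that were completed successfully.
    """
    completed = []
    for stage in stages:
        deploy(stage)
        if not verify(stage):
            break  # later stages are never touched
        completed.append(stage)
    return completed


# Simulated run in which the pilot school reveals a problem.
deployed = []
completed = run_staged_rollout(
    deploy=deployed.append,
    verify=lambda stage: stage != "pilot school",
)
print(completed)
print(deployed)
```

In the simulated run, the roll-out stops after the pilot school, so the remaining schools keep their working installation while the problem is investigated.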

=== Fall-back solution ===

Much can go wrong during a new installation or upgrade. Therefore, a fall-back solution must be ready. This lets one quickly get back to the system as it was before the upgrade. In technical terms, this is called roll-back.

When rolling back, it is absolutely essential to have the previous version of the software archive and configuration files ready. This means that you can install, for example, Debian Edu 1.0 in under an hour, and put the appropriate configuration files in place.

But roll-back takes time. Therefore, it may be prudent to have a server ready with the previous version of the software, the right configurations, and a recent copy of the users' home directories. This server can quickly replace any machines on which the upgrade does not go according to plan. Having server machines in reserve can ensure high availability even if something goes wrong.

=== Advantages and possible problems ===

The advantage of having an archive of the software in production cannot be overestimated. Many rely on keeping the software on its original CDs and DVDs. This makes for inefficient distribution. To save time and trouble, all the software in Debian Edu is available online.

Your operations department can create a copy of the Debian Edu archive on a central server. From there, all the software can be installed quickly and smoothly on other machines. The advantage is that your ICT service always has an overview of which software versions it has made available to schools. This also prevents the installation of software that has not been reviewed by the change management steering group.

Considerable problems can arise if you do not maintain the software archive and configurations. Someone may also make a mistake in a configuration file or software package, which then gets rolled out to all machines. In addition, some schools may install insufficiently tested software or put beta releases into production. So you need good processes, and someone must be responsible for maintaining the software archive and configurations.

It may seem like one needs a lot of extra things in place in order to install and maintain the services and programs that are in use. However, if you skip the tools that provide management of upgrades, you give yourself a lot of extra work. The ICT service must spend a lot of time on manual installation on each machine. The danger of making mistakes increases. When things do not work you get disgruntled users, and much time is spent fixing problems.
Line 504: Line 430:
=== Planlegging og implementasjon ===

Årsaken til at man planlegger før man gjennomfører endringer er for å hindre uker eller ekstra måneder med problemer. Selv om man skulle bruke noe ekstra tid på planlegging, så tjenes dette raskt inn fordi man unngår ekstra problemer. Det vil alltid være personer som forteller at de ikke har hatt problemer med ad-hoc-endringer i systemene. Men når man undersøker nærmere viser det seg at det er problemer etter endringer, og at henvendelser om dette ikke formidlet videre.

I våre øyne er ad-hoc-løsninger kun en omvei ved endringer, og kun en nødløsning. En ad-hoc løsning kan sammenlignes med en midlertidig reparasjon med «ståltråd og tape». Man må på sikt rydde opp i slike løsninger når man vil ha stabil drift uten stadige overraskelser. Ved å hoppe over en planleggingsfase vil man få mange flere ad-hoc-løsninger, og flere driftsproblemer ved endringer eller oppgraderinger. Derfor er det helt avgjørende at fagfolk og ledelsen forstår verdien av en god planprosess. endringer.

Derfor anbefaler vi at man innkaller til planmøte, og lager en stegvis plan ved endringer av systemet. En stegvis plan vil selvsagt variere i forhold til hva som skal endres. Det å oppgradere kontorprogrammet OpenOffice.org er noe annet enn å oppgradere hele systemet. Ved oppgradering til nytt kontorprogram holder det kanskje med en 2-3 timers gjennomgang av kontorpakken for læreren på hver skole. Når man skal oppgradere hele systemet må man både sørge for brukeropplæring og at det tekniske fungerer etter forutsetningene.

Hovedpoenget er at det er få snarveier når det kommer til planlegging og implementasjon. Undersøkelser viser at de som planlegger skikkelig og sørger for at folk har riktig kompetanse har lavere driftskostnader knyttet til driften.

=== Aktiviteter ===

Det er helt avgjørende å planlegge nye utgivelser. De fleste endringer av systemet skal avklares med ledelsen. Følgende liste over aktiviteter er laget som støtte ved oppgraderinger i en plan- og gjennomføringsfase.

<table>
<tbody>
<tr class="odd">
<td align="left">'''Oppgaver'''
</td>
<td align="left">'''Detaljer'''
</td>
</tr>
<tr class="even">
<td align="left">Prioritering av utgivelsen:
</td>
<td align="left">Sjekk om nødvendige beslutninger er gjort før en endring eller oppgradering skal rulles ut.
</td>
</tr>
<tr class="odd">
<td align="left">Sentralt programarkiv
</td>
<td align="left">Sørg for at de aktuelle programpakker som ønskes installert er på plass i det sentrale programarkivet.
</td>
</tr>
<tr class="even">
<td align="left">Konfigurasjonsdatabase
</td>
<td align="left">Sørg for å ha på plass alle oppsettfiler. Det gjelder både de som er i bruk, og de nye som følger med systemene som endres eller oppdateres.
</td>
</tr>
<tr class="odd">
<td align="left">Bygg-håndtering
</td>
<td align="left">Alle skript og systemer som brukes til utrulling eller å lage diskbiler (images) må på plass.
</td>
</tr>
<tr class="even">
<td align="left">Testing
</td>
<td align="left">Kjør først utprøving på testutstyr. Når dette fungerer uten problemer så kan det prøves ut med en skole. Skolen må være fullt orientert om, og med på at de skal prøve ut ny programvare. Når man er sikker på at alt virker kan man oppgradere hos alle.
</td>
</tr>
<tr class="odd">
<td align="left">Reserveløsning
</td>
<td align="left">Selv med omfattende testing kan nye utgivelser gå galt. Derfor er det avgjørende å ha en reserveløsning. Den enkleste reserveløsningen er å ha den gamle installasjonen med data på en egen tjenermaskin. En slik maskin kan plugges inn om endringen eller oppgraderingen ikke virker.
</td>
</tr>
</tbody>
</table>

=== Verktøy ===

Som man ser av aktivitetslisten trenger man flere verktøy for å holde orden på forskjellige utgivelser av programvaren, tjenester og maskinvare i systemet. Noen av disse verktøyene er nevnt tidligere. Men vi gjentar dette allikevel:

 * Debian-verktøy for sentralt programarkiv
 * Database for konfigurasjoner og maskinvare (subversion for oppsettfiler, regneark med oversikt over all maskinvare med fysisk plassering)
 * System for bygghåndtering
 * Maskinvare for testing og reserveløsning

=== Relasjoner til andre prosesser ===

Utgivelsesledelse griper rett inn i kjernen til IT-tjenesten. Det går på å gjennomføre ønskede sikkerhetsoppdateringer, endringer i tjenester, eller og oppgradering av dataprogram. Forespørsler om nye utgivelser kan skyldes driftsproblemer eller ønske om ny programvare. Før en ny utgivelse så er det gjort en vurdering om endringen er ønskelig.

Om endringen er grei så vil man gjøre nødvendige endringer i konfigurasjoner og klargjøre programpakker for utrulling. Dette vil være testet, og man vil ha på plass reserveløsninger. Når endringene er utført vil man kanskje måtte legge om deler av driftsaktiviteten. Så det er enkelt å se at endringshåndtering påvirker alle deler av driftsstøtten.

== Verktøy for driftsstøtte ==

Det første man skal spørre seg selv om: «trenger vil virkelig programvareverktøy?» Trenger man verktøy så er det avgjørende å undersøke alternativene grundig.

Tar man en glanset brosjyre, og lytter til salgsprat, så er man helt avhengig av slike verktøy. Men gode folk, gode prosessbeskrivelser, og gode prosedyrer og arbeidsbeskrivelser er et grunnlag for god tjenestestyring. Behovet for, og hvor kompliserte verktøyene er, er avhengig av virksomhetens behov for datasystemer, og størrelsen på organisasjonen.

I en liten organisasjon vil en enkel fritt tilgjengelig database være nok for logging og styring på hendelser (request tracker). Men i større organisasjoner vil man ganske sikkert ha behov for et sofistikert distribuert og integrerte verktøy for tjenestestyring. Det betyr at man linker alle prosesser til et system for hendelseshåndtering.

Selv om verktøy kan være viktig, så er ikke disse viktige i seg selv. Det er de oppgaver og prosesser som må gjøres, og informasjonen som det er behov for som er utgangspunktet. Dette vil gi nødvendig informasjon til en spesifisering for hvilke verktøy som passer best til å støtte driften. Her er noen grunner til hvorfor man kan bruke programvare til driftsstøtte og tjenestehåndtering:

 * økte krav fra brukerne
 * mangel på IT-kunnskap
 * budsjettbegrensninger
 * virksomheten er helt avhengig av kvaliteten på tjenesten
 * integrasjon av systemer fra flere leverandører
 * økt kompleksitet i IT-infrastrukturen
 * fremvekst av internasjonale standarder
 * økt omfang og endringer innen IT

Automatiske verktøy tillater:

 * sentralisering av nøkkelfunksjoner
 * automatisering av funksjoner i tjenesteleveransen
 * analyse av data
 * identifisering av trender
 * preventive tiltak kan implementeres

=== Type verktøy ===

I dette kapitlet har vi forestått en rekke verktøy for å forbedre driftsstøtten. Her følger en oppsummering av verktøyene:

 * Debian-verktøy for sentralt programarkiv
 * Database for konfigurasjoner og maskinvare (subversion for oppsettfiler, regneark med oversikt over all maskinvare med fysisk plassering)
 * System for bygghåndtering
 * Maskinvare for testing og reserveløsning
 * Hendelseslogger (Request Tracker)
 * System for overvåking (Munin)

Etter som driftsavdelingen får mer erfaring med systematisk drift vil det lages, eller skaffes flere typer verktøy.

=== Evalueringskriterier ved valg av verktøy ===

Selv om det er brukt store beløp på å lage evalueringskriterier for programvare, så finnes ikke annet enn erfaringsbaserte retningslinjer. Det er ingen endelige svar på hva som er god eller mindre god programvare. Som med mye annet dreier en del seg om smak og behag. Flere løsninger gjøre samme jobben like godt, men kan ha ganske forskjellig utforming. Men det er noen tommelfingerregler som kan være nyttige å ta med seg.

Det viktigste evalueringskriteriet er om man har behov for å gjøre en jobb i det hele tatt. Mange IT-verktøy er helt perfekt og løser sine oppgaver uten feil, men det løser oppgaver ingen trenger å ha løst. Så det viktigste kriteriet er om man løser riktig problem, og om det i det hele tatt er nødvendig å gjøre noe som helst.

 * Så det første man spør om er om verktøyet er ønsket.

Om det viser seg at man vil ha løst en oppgave, kan det vise seg at løsningen er så enkel at det er like greit å kjøre noen kommandoer for hånd. Det enkleste er gjerne det beste. Men når man får mange maskiner å drifte blir automatisering helt avgjørende. Det blir for mye jobb å logge seg inn på 20 likeartede tjenermaskiner for å gjøre en sikkerhetsoppgradering. Da er automatisering tingen.

 * Så her må man spørre om verktøyet er nyttig til å løse oppgaven.
 * Deretter må man spørre om verktøyet er brukbart.

Det er ofte et stort utvalg av programmer og fremgangsmåter for å løse et bestemt oppgave. Men en del problemer løses helt annerledes når man vedlikeholder 500 datamaskiner og 11 tjenermaskiner enn når man fikser hjemme-PC-en. Et eksempel kan være verktøy for at læreren kan se skrivebordet til hver enkelt elev på sin klientmaskin. Læreren kan stoppe og starte programmer hos alle elevene, og hindre enkeltelever å bruke f.eks. lynmeldinger når dette forstyrrer skolearbeide.

Når det gjelder valg av driftsverktøy handler det om automatisering og forenkling av driftsoppgaver. Det er om å gjøre og få redusert manuelt arbeide til et minimum. Så motivasjonen er å kun vedlikeholde automatikken. Også her går det på å gjøre ting enkelt, noe som kan være en betydelig jobb å få til.

Som man ser er det slett ikke enkelt å sette opp gode kriterier for valg av driftsverktøy for store installasjoner. Mest av alt kan dette skyldes at utviklere av programvare ofte mangler erfaring fra drift av IT-systemer. De er kun kjent med å lage nye ting, og det å lage gode og relevante verktøy for drift krever mange års erfaring.

En del generelle driftsverktøy som ikke har vært byttet ut de siste 20 årene. Men de produktene som brukes kan være byttet ut. Også noen programmer kan om få år være uaktuelle å bruke. Derfor må man belage seg på trening i nye utgaver av programmene som brukes til drift, eller ved oppgradering og endringer i brukerprogram.

=== Produkttrening ===
=== Planning and implementation ===

The reason for planning before implementing changes is to avoid weeks or months of extra problems. The time spent on planning is quickly regained because additional problems are avoided. There will always be people who say they have had no problems with ad hoc changes to the systems, but closer examination reveals that problems do follow such changes; they merely are not reported.

In our view, ad hoc solutions are only a detour when making changes, and only an emergency measure. An ad hoc solution is like a temporary repair with "wire and tape". In due course, such solutions must be cleaned up to ensure stable operation without constant surprises. Skipping the planning phase leads to many more ad hoc solutions, and more operational problems when changes or upgrades are made. Therefore it is essential that professionals and management understand the value of a good planning process for changes.

Therefore, we recommend that you convene a planning meeting and make a stepwise plan for changes to the system. A stepwise plan will naturally vary according to the change. Upgrading the !OpenOffice.org suite is quite different from upgrading the whole system. When upgrading to a new office suite, a 2-3 hour walk-through of the suite for the teachers at each school may be enough. When upgrading the entire system, one must both provide user training and verify that the technical side works as intended.

The main point is that there are few shortcuts when it comes to planning and implementation. Studies show that those who plan properly and ensure that people have the right skills have lower operating costs.

=== Activities ===

It is crucial to plan new releases. Most modifications of the system should be cleared with management. The following list of activities is intended as support during the planning and implementation phases of an upgrade.

||'''Tasks'''||'''Details'''||
||Prioritization of the release:||Check if the necessary decisions are made before a change or upgrade would be deployed.||
||Definitive Software Library||Ensure that the appropriate software packages to be installed are in place in the definitive software library.||
||Configuration database||Be sure to have in place all configuration files. This applies both to those who are in use, and the new ones supplied in systems to be changed or updated.||
||Build management||All scripts and systems used to deploy or create disk images must be in place.||
||Testing||First, run trials on test equipment. When this works without problems, the release can be tested at one school. The school must be fully informed about, and have agreed to, trying out the new software. Only when you are sure that everything works should you upgrade all schools.||
||Fall-back solution||Even with extensive testing, new releases may go wrong. Therefore it is essential to have a fall-back. The easiest solution is to keep the old installation, with its data, on a separate spare server machine. Such a machine can be plugged in if the change or upgrade does not work.||
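
The testing and fall-back rows above describe a staged rollout. As a minimal sketch of that idea (not a description of how Debian Edu itself deploys releases), the stages can be walked in order, stopping at the first failed check. The `deploy`, `verify` and `fall_back` callables are hypothetical stand-ins for whatever actually installs, checks and restores a release at a site.

```python
from typing import Callable, Iterable, List, Tuple

# Stage names follow the table above; everything else is illustrative.
STAGES = ["test equipment", "pilot school", "all schools"]


def staged_rollout(
    stages: Iterable[str],
    deploy: Callable[[str], None],
    verify: Callable[[str], bool],
    fall_back: Callable[[str], None],
) -> Tuple[bool, List[str]]:
    """Deploy stage by stage; stop and fall back at the first failed check.

    Returns (success, log), where log lists what was done, in order.
    """
    log: List[str] = []
    for stage in stages:
        deploy(stage)
        log.append(f"deployed: {stage}")
        if not verify(stage):
            # Only the failing stage is restored; later stages are never reached.
            fall_back(stage)
            log.append(f"fell back: {stage}")
            return False, log
        log.append(f"verified: {stage}")
    return True, log


if __name__ == "__main__":
    # Simulated run in which the pilot school reports a problem.
    ok, log = staged_rollout(
        STAGES,
        deploy=lambda stage: None,
        verify=lambda stage: stage != "pilot school",
        fall_back=lambda stage: None,
    )
    print(ok)
    print("\n".join(log))
```

The point of the design is that a failure at the pilot school never reaches the remaining schools.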

=== Tools ===

As the activity list shows, one needs several tools to keep track of the different releases of software, services and hardware in the system. Some of these tools were mentioned previously, but we repeat them here:

 * Debian tools for the definitive software library
 * Database for configurations and hardware (setup files in Subversion, spreadsheets detailing all hardware with physical location)
 * Build management (the system which builds Debian packages)
 * Hardware for testing and for the fall-back solution

=== Relations to other processes ===

Release management goes to the core of the ICT service. It covers implementing appropriate security updates, changes to services, and upgrades of computer software. Requests for new releases may be due to operational problems or a wish for new software. Before a new release, it is assessed whether the change is necessary.

If the change is straightforward, one makes the necessary configuration changes and prepares the application packages for rollout. These must have been tested, and a fall-back solution must be in place. Once the changes are made, parts of the operational activity may also have to change. It is easy to see that change management affects all parts of operational support.

== Tools for operational support ==

The first thing you should ask yourself is: "Do we really need software tools?" If the answer is yes, it is crucial to examine the options thoroughly.

If you go by glossy brochures and sales talk, we are totally dependent on such tools. But good people, good process descriptions, good procedures and good job descriptions are the basis for good service management. How much tools are needed, and how complicated they must be, depends on the organisation's need for computer systems and on the size of the organisation.

In a small organisation, a single freely available database for logging and managing events (a request tracker) will be enough. Larger organisations will almost certainly need sophisticated, distributed and integrated tools for service management. This means linking all processes to a system for event handling.

Although tools can be important, they are not important in themselves. What matters are the tasks and processes to be carried out and the information they need. These provide the information necessary to specify which tools are best suited to support operations. Here are some reasons why one may use software for operational and service management:

 * increased demands from users
 * lack of ICT knowledge
 * budget limitations
 * the organisation is entirely dependent on the quality of service
 * integration of systems from multiple vendors
 * increased complexity of ICT infrastructure
 * emergence of international standards
 * extended scope and changes in ICT

Automatic tools allow:

 * Centralisation of key functions
 * Automation of functions in the service delivery
 * Analysis of data
 * Identification of trends
 * Implementation of preventive measures

=== Type of tool ===

In this chapter we have proposed a number of tools to improve operational support. Here follows a summary of the tools:

 * Debian tools for the definitive software library
 * Database for configurations and hardware (setup files in Subversion, spreadsheets detailing all hardware with physical location)
 * Build management (the system which builds Debian packages)
 * Hardware for testing and for the fall-back solution
 * Request Tracker
 * System for monitoring (Munin)

As the operations department gains further experience with systematic routines, additional and different types of tools will be developed or acquired.

=== Evaluation criteria when selecting tools ===

Although large amounts are spent on creating evaluation criteria for software, the result is only experience-based guidelines. There is no final answer to what is good or less good software. As with much else, it is partly a matter of taste. Different solutions may do the same job equally well but have quite different designs. Still, some rules of thumb may be useful.

The main evaluation criterion is whether the job needs doing at all. Many IT tools work perfectly and without error, but solve tasks that did not need solving. So the main criterion is whether the tool solves the right problem, and whether anything needs to be done at all.

 * So the first thing one asks is whether the tool is needed.

If it turns out that a task must be done, the solution may be as simple as running some commands manually. The simplest way is best. But when there are many machines to operate, automation becomes crucial. It is too much work to log into 20 similar server machines to do a security upgrade. That is when automation is needed.

 * So here one must ask whether the tool is useful for solving the task.
 * Then one must ask whether the tool is usable.
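
To make the case of 20 similar servers concrete, here is a minimal sketch of automating the same security upgrade across a list of machines. It assumes Debian servers reachable over SSH with key-based root login; the host names are hypothetical, and the `runner` parameter exists only so the logic can be tried without touching real machines.

```python
import subprocess
from typing import Callable, Dict, List, Sequence

# Assumption: Debian servers, so apt-get performs the upgrade.
UPGRADE_COMMAND = "apt-get update && apt-get -y upgrade"


def build_ssh_command(host: str, remote_command: str = UPGRADE_COMMAND) -> List[str]:
    """Return the argument list for running remote_command on host as root."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", f"root@{host}", remote_command]


def run_command(command: Sequence[str]) -> int:
    """Run a command locally and return its exit status."""
    return subprocess.run(command, check=False).returncode


def upgrade_all(
    hosts: Sequence[str],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> Dict[str, bool]:
    """Run the upgrade on every host and report which ones succeeded."""
    return {host: runner(build_ssh_command(host)) == 0 for host in hosts}


if __name__ == "__main__":
    # Dry run: only print the commands for 20 hypothetical servers.
    hosts = [f"server{n:02d}.example.org" for n in range(1, 21)]
    for host in hosts:
        print(" ".join(build_ssh_command(host)))
```

Calling `upgrade_all(hosts)` would actually run the upgrades and return a per-host success map, which gives the operator a list of machines that still need attention.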

There is often a wide range of programs and procedures for solving a specific task. But some problems are solved completely differently when maintaining 500 computers and 11 servers than when fixing your home PC. An example might be tools that allow the teacher to see the desktop of each student on his or her client machine. The teacher can stop and start programs for all pupils, and prevent individual pupils from using, for example, instant messaging when this interferes with school work.

The choice of operating tools is about automating and simplifying operational tasks. It is about reducing manual work to a minimum. The motivation is to make maintenance automatic. Here, too, the aim is to make things easy, which can itself take considerable work to achieve.

As you can see, it is not easy to set up good criteria for selecting operating tools for large installations. Most of all, this is because software developers often lack experience in operating IT systems. They are used to creating new things, but creating good and relevant tools for operations requires many years of experience.

Some general operational tools have not been replaced in the last 20 years, although the products used may have been. Some programs may also become irrelevant within a few years. Therefore, one must plan for training in new editions of the applications used for operation, and in upgrades and changes to user programs.

=== Product training ===
In Norway, training and product training are regulated by the Working Environment Act (Arbeidsmiljøloven, § 4-2):

 * Employees and their union representatives shall be kept continuously informed of the systems used in planning and carrying out the work. They shall be given the training necessary to familiarize themselves with these systems, and they shall take part in designing them.
== Planning at the start of the implementation of service support ==

A growing number of organisations see the need for service management. Decisions are often based on historical and political considerations rather than on the organisation's current needs. It is therefore important to ensure that management commits to taking part in, and understanding, the working methods of the organisation, and to go through the existing processes and compare them with the organisation's needs and "best practice".

=== Implementing service support ===

Health check

=== Feasibility study ===

<FIXME>

=== Determine current situation ===

Health check

=== General guidelines for project planning ===

Business case for the project

Critical success factors and possible problems

Project costs

Organisation

Product

Planning

Communication plan

=== Project review and reporting ===

Progress

Evaluation of the project

Follow-up work

Review to check compliance with the quality parameters

Review against key factors

Service support

As mentioned in the introduction, it is recommended to begin by establishing an office for centralized operations to allow you to manage tickets. The benefits of this come quickly and are visible, which is important for customer and user satisfaction.

After the office is up and running with a sensible workflow for tickets (user requests and troubleshooting) you will move on to the biggest challenge for the organisation. As a rule, this is either change management or problem solving. Organisations with "cowboy" system administrators who come up with smart ideas and implement them without much testing, often begin with change management. For organisations suffering recurring outages, problem solving comes first.

Whatever you choose to start with, a certain amount of configuration management will be necessary. Managing configuration is critical to delivering the software and services for the user. Software must work as expected. In order to make beneficial changes, one must know the configuration of the different programs.

To manage configuration changes you may use a configuration management database (CMDB). Few people use a single database for all configurations, and you do not have to either. It's fine to place configurations in multiple smaller and partly independent repositories. Some people, for example, store configurations and setups in version control. But even if you have different repositories, you may get greater benefits if you connect information from the different processes.

For users of Debian Edu, most service configurations lie within a specific directory (/etc). These may benefit from being collected and stored in a central version controlled directory. This makes it easier to restore lost services and setup machines if they are reinstalled. This applies both to servers and user laptops or workstations. As part of the backup system in Debian Edu, a backup is made of the setup directory /etc. But the backup system is nothing more than a database or a version controlled directory for configurations.
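
As a rough illustration of why a central record of `/etc` helps, the sketch below takes a checksum snapshot of a configuration directory and compares two snapshots. It is a simplified change detector, not a replacement for version control, which would also store the file contents and history. The function names are made up for this example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(config_dir: Path) -> Dict[str, str]:
    """Map each regular file under config_dir to the SHA-256 of its content."""
    manifest: Dict[str, str] = {}
    for path in sorted(config_dir.rglob("*")):
        # Skip directories and symlinks; only real files are hashed.
        if path.is_file() and not path.is_symlink():
            relative = path.relative_to(config_dir).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def changes(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two snapshots and list added, removed and modified files."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "modified": sorted(
            name for name in set(old) & set(new) if old[name] != new[name]
        ),
    }
```

Running `snapshot(Path("/etc"))` before and after a change, and passing both results to `changes`, shows which files the change touched. Some files under `/etc` are readable only by root, so the snapshot would need sufficient privileges.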

Service Desk

The Service Desk is where users submit questions or errors. At school, the ICT-contact often forwards operational events to the Service Desk. There may also be requests like setting up a new PC, or installing a program.

At school the ICT contact is the link to the Service Desk. The ICT contact also responds to the most common questions. Some questions are too difficult to manage at each school and must be forwarded to the Service Desk. It is important to have good cooperation between the school ICT contact and Service Desk operators. Tasks that are too extensive or too difficult to solve locally should be passed to the Service Desk.

Users may also get direct answers from an operator at the Service Desk. All operational enquiries go to the Service Desk. Enquiries will be assigned a case number. Anyone who has registered a case will receive an e-mail confirming that the inquiry has been received. During consideration of the case, those working with it at the Service Desk may send updated status to the user.

This way, users get one point of contact, and service desk operators get an overview of all of the cases. Operations can be expected to troubleshoot across all parts of the organisation. Periodically the team leader needs to go through all issues and solutions in order to prioritize debugging and to prevent re-occurrence of errors, in order to provide schools with a stable operating environment.

Incidents can be reported by phone, fax, email or web form. Incidents that are more urgent must be prioritized. Incidents that need to be resolved quickly are usually reported by telephone. Less important events are usually reported by email, for example. A member of the support staff should be assigned to the incident and will need to ask the user questions to investigate the problem.

  • Remember to be an active listener, not a passive one.

All enquiries should be logged, and an email confirmation should be sent. It is important that the user feels safe, and information about what the problem might be should be communicated to them. When the enquiry arrives at the service desk, a brief description of the incident should be logged. The enquiry may be from the ICT contact at the school, or from someone with an agreement to use the service desk. The event should be logged as soon as possible and assigned a case number. The user should get a confirmation by email that the matter has been received and assigned a case number.
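
The logging steps above (register at once, assign a case number, confirm by email) can be sketched as follows. This is a hypothetical in-memory stand-in meant only to show the flow; it is not how RT or any other request tracker is implemented, and the class and field names are invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count
from typing import Dict, List


@dataclass
class Case:
    number: int
    reporter_email: str
    summary: str
    opened: datetime
    history: List[str] = field(default_factory=list)


class RequestLog:
    """In-memory stand-in for a request tracker (illustration only)."""

    def __init__(self) -> None:
        self._numbers = count(1)  # case numbers 1, 2, 3, ...
        self.cases: Dict[int, Case] = {}

    def register(self, reporter_email: str, summary: str) -> Case:
        """Log an enquiry straight away and give it a case number."""
        case = Case(
            number=next(self._numbers),
            reporter_email=reporter_email,
            summary=summary,
            opened=datetime.now(timezone.utc),
        )
        case.history.append("received")
        self.cases[case.number] = case
        return case

    def confirmation(self, case: Case) -> str:
        """Compose the text of the confirmation email for the reporter."""
        return (
            f"To: {case.reporter_email}\n"
            f"Subject: [Case #{case.number}] {case.summary}\n\n"
            f"Your enquiry has been received and assigned case number {case.number}."
        )
```

Keeping a `history` list on each case mirrors the idea that every status update sent to the user is also recorded in the log.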

Previously, enquiries were written in paper logbooks. Today software is used to record them; such software is known as a request tracker. Logging enquiries is crucial for operations. The log is the basis for error handling, user requests, and prioritization of the various incidents. Log entries are important to prevent recurring errors. Because operational events are periodically reviewed, an assessment of fixes and priorities can be made. The log also provides a basis for improving the service by debugging problem services and applications based on what users perceive as problematic.

Thus the log of requests is a basic and necessary tool both for users and the service desk. There are several freely available systems for logging requests with good documentation <<FootNote(RT Essentials: http://www.oreilly.com/catalog/rtessentials/chapter/index.html )>>. Skolelinux Drift uses RT to handle requests.

When starting up support, it is important not to make the start too hard. Do not try to achieve everything at once; go instead for "quick wins" that keep the user informed, and aim for quick response times. It is also important to clarify who the service desk should forward events to if they cannot solve the issue themselves. The support desk must also check whether there will be disruptions for the user. This makes it quick and easy to give feedback.

For the users it is important that incidents are dealt with. For the service office it is important that the incidents are handled correctly according to the service level agreement, and that work requested outside of what was agreed is handled between management at the school and the system administration organisation.

Tasks and roles

We recommend agreeing on what duties the school's ICT contact has and what the responsibilities are of those who work at the Service Desk. Schools often have few resources compared to what is common in municipal administrations or private companies. At the same time, schools usually have many more users, and often more client machines, than the rest of the municipality.

To distribute tasks, roles must be in place. With clearly established roles it is easier to distribute tasks and to ascertain the working capacity needed to carry out operational tasks. Operational experience in municipalities and professional organisations shows that these roles are common:

  • ICT contact at each school. This is often a teacher with an educational and/or technical ICT background.
  • Operator(s) working in the central IT service. This is a person skilled in operations.
  • ICT coordinator who organises the educational use of IT, and contributes towards plans for developmental, operational and educational use. Often this is a teacher.
  • ICT responsible. This is usually the principal who is responsible for IT operations.

Here is an overview of the various everyday tasks, some of which are contracted out by the municipalities.

ICT contact(s) tasks at each school:

  • Oversee the school's server room.
  • Be the school's contact at the municipality - report errors and outages.
  • Perform simple maintenance tasks such as replacing mice and keyboards, upgrading thin clients, and simple patching.
  • Be the school's superuser - advise colleagues about the user interface, e-mail, video projectors and relevant applications.
  • Participate in ICT gatherings.
  • Create and administrate local users.
  • Perform simple maintenance of printers.
  • Create and manage email accounts.
  • Perform simple commands and operations under the guidance of an ICT tutor.
  • Facilitate the use of ICT in teaching.

The operator's tasks:

  • Receive incidents and service requests.
  • Mentor ICT contacts by telephone and e-mail.
  • If agreed, visit the school for troubleshooting defects and malfunctions on computers, printers and servers.
  • Make backups.
  • Install security updates on the school's computers (servers and clients).

ICT coordinator's duties:

  • Assist school management and ICT contacts in expanding technical and pedagogical ICT plans.
  • Ensure that the service desk and the management get a good selection of software.
  • Ensure that the schools have appropriate ICT tools for teaching, and that computers and networks are appropriate for school subjects.
  • Provide advice and guidance to operational services on what the technical and pedagogical ICT requirements of the school are.

Duties of the ICT responsible (principal, headmaster, head of operational services):

  • Make joint purchases of computer equipment and enter into joint agreements etc.
  • Develop competence plans.
  • Provide the schools with courses in the educational use of ICT.
  • Provide operations courses.
  • Negotiate contracts for operations.
  • Ensure that the IT contact and the IT service have the necessary resources.

The advantage of an agreement for these tasks is that expectations on the individual are known, giving a good basis for planning and managing ICT services. Usually these ICT tasks are only done part-time by a teacher who also has teaching duties.

A business would often have two staff members working full time, operating 100 standard client machines with 100 users. In schools there may be a 30% position in total, divided among several persons, operating 100 client computers used by 320 students and teachers.

When the school has so few resources for operations, it is crucial to have good resource management. Making agreements for the tasks can make it easier to assess whether you need additional resources, or need to reduce expectations of IT initiatives in schools to fit the budget. With a good overview of the ICT tasks in the school, it would be easier for IT administrators to ask for an increase in resources if necessary. There may be a need for increased resources to implement ICT-based exams, or a need for new equipment like whiteboards as teaching aids.

Expected time usage

We've created a table showing time spent on operations and maintenance. The table is based on the experiences of municipalities which run a centrally operated Debian Edu installation for 9-10 schools with 250-500 client computers. Several things are not included in the table. Extra time is therefore required for projects where schools develop their own ICT solutions with networking and more equipment.

||'''Role'''||'''Operational responsibility'''||'''Time spent per school per week'''||'''Time spent in total for all schools'''||
||Centralised operations staff||Monitoring, debugging and operation of 500 machines, for example 10 schools with 3,200 students and teachers.||2-3 h (50 clients)||½ position (500 clients)||
||ICT contact at each school||Oversight of equipment, easy maintenance, and reporting of incidents and requests.||3-4 h (50 clients)||1 position (10 schools / 500 clients)||
||Central ICT coordinator||Assist in planning and implementation of educational and technical ICT work in the school.||1-2 h||½ position||
||ICT manager (principal)||Make joint purchases, and ensure compliance with the service level agreement. Schedule updates, or develop solutions.||1 h||¼ position||
||Overall for a school||50 client machines (concurrent users)||6-10 h|| ||
||Overall for all schools||10 schools, 500 client machines (concurrent users)|| ||2¼ positions||

Experience shows that the scope of work of the ICT contact is affected by the number of concurrent users. The term "concurrent users" is new to many. To illustrate with an example: A school may have 250 students but not more than 50 computers. Then a maximum of 50 students can use computers at the same time. This is much less than the total 250 users who have an account on the system. It is these 50 logged in users that provide work for IT service. The other 200 people not logged in give little extra work.
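
The concurrent-user example can be written out as a small calculation. This is only a restatement of the arithmetic above, with hypothetical function names: the number of people who can be logged in at once is capped by the number of client machines.

```python
def concurrent_users(user_accounts: int, client_machines: int) -> int:
    """At most one user per client machine can be logged in at a time."""
    return min(user_accounts, client_machines)


def idle_accounts(user_accounts: int, client_machines: int) -> int:
    """Accounts that cannot be logged in while all machines are in use."""
    return user_accounts - concurrent_users(user_accounts, client_machines)


if __name__ == "__main__":
    # The school from the example: 250 students, 50 computers.
    print(concurrent_users(250, 50))
    print(idle_accounts(250, 50))
```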

Therefore, it is common to calculate IT costs from the maximum number of concurrent users. Other calculation methods are also possible, for example when paying for proprietary software. But since Debian Edu has no license costs, the number of concurrent users is the most crucial figure for operating costs. Calculating costs from the number of user accounts makes little sense for a school.

For users of Debian Edu the cost difference to manage 100 or 250 user accounts is very small. There are a few exceptions. With 250 students instead of 100, some students may repeatedly forget their password. Therefore, it is wise to let the teacher responsible for the class give these students a new password.

If the school has 50 client machines, the ICT contact needs less time for operational tasks than if the school has 150 clients. With more clients, the overall time spent on operation increases, but operating time per client machine goes down somewhat.

Several municipalities have set aside 3-4 hours a week for the ICT contact's tasks at each school with 30-70 client machines. The Education Department in Oslo has set aside half a weekday, or a 30% position, to follow up 150 client machines. Experience from other municipalities suggests that a 20% position is enough for the tasks of a local ICT contact when a school has 160 thin or diskless clients with Debian Edu.

In addition there are associated costs of centralized operations, ICT management, and development of the educational use of ICT tools in school subjects. One position is probably sufficient for the operation of 1000 client machines. When it comes to educational support, several principals have a 50-100% position in the school for this work. There may be a 10-20% position as an ICT contact and a 40-80% position as educational support for the teachers. Many teachers perceive IT tools in schools to be something new. Some principals wish to give more backing to the educational side by making teachers more confident in using IT tools across the different subjects.

Check list

We have drawn up a list of tasks for setting up a new service desk.

  • Assign people to the different roles: IT manager, IT contact in schools, central operations and IT coordinator for all schools. It is important to make a distinction between what is technical operations and maintenance, and what is pedagogical work.
  • Establish the service desk such that every school has a service agreement regulating what is standard operating activities, and what is extra. It is imperative that ICT-responsible principals are a part of this process.
  • Establish a system for handling incoming requests (a request tracker). All enquiries by email need a case number. Almost all enquiries from users or IT contacts from schools also need a case number.
  • Ensure that the ICT budget reflects the contribution necessary to ensure proper operation of school computer equipment and networks. The requirement today is that the ICT systems will be used for national and local tests using ICT tools, with or without the Internet.
  • As a starting point, use the standard edition of Debian Edu, with the same version at all schools. Make the changes you want from this base. These changes must be recorded in a configuration database, with documentation of the changes made. Version control can be used to store the changes and documentation.

Incident Management

The purpose of the ICT service is to prevent disturbances like shutdowns or software issues. Users will experience few problems with the ICT system if the ICT service has enough resources to handle operations, equipment and enquiries to the Service Desk. Small or big problems will cause interruptions for users, so good handling of incidents is necessary.

In parachuting they call near-accidents "incidents". It is perhaps not quite the same in computer operations when something is not working. The purpose of dealing with incidents is to restore services as quickly as possible so that everything works normally. If something goes wrong, it must have the least possible impact on users. What is a "normal service" is agreed through an operating agreement describing the service level.

Incident statistics are important, especially if several people work within the organisation. When several people work together, it is easy to lose track of the work. Statistics will point out problem areas that must be addressed more thoroughly than with a quick fix from the service desk. For example, there may be many requests to replace forgotten passwords, so it may be wise to let the teacher change passwords for pupils in their class.

An operational disturbance is defined as:

  • an event which is not part of normal operations and causes, or can cause, an interruption or reduction in the quality of the service.

Examples of operational disturbances may be:

  • Programs
    • the office program (OpenOffice.org) does not start
    • the web browser (Firefox) crashes
    • the hard drive is full
  • Hardware
    • the server is down
    • unable to print
    • unable to log in
  • Requests
    • requests for information, advice or documentation
    • forgotten password

The examples show some of the most common operational issues. These are problems that prompt users to contact the school or the service desk. The ICT service must prioritize what must be handled straight away, and which problems need more time to resolve. To prioritize which problems need more comprehensive debugging, it is important to log all enquiries about malfunctions. Once one has an overview of the most common problems, appropriate actions can be taken.
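
Getting an overview of the most common problems from the log can be as simple as counting categories. The sketch below assumes each logged enquiry carries a category string (the labels used here are taken from the examples above); how categories are actually stored depends on the request tracker in use.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def most_common_problems(categories: Iterable[str], top: int = 3) -> List[Tuple[str, int]]:
    """Return the `top` most frequent categories with their counts."""
    return Counter(categories).most_common(top)


if __name__ == "__main__":
    logged = [
        "forgotten password",
        "unable to print",
        "forgotten password",
        "server is down",
        "forgotten password",
        "unable to print",
    ]
    for category, number in most_common_problems(logged):
        print(f"{number:3d}  {category}")
```

A list like this is what lets the team decide, for example, that password resets are frequent enough to delegate to class teachers.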

Check list

We have made a short check list to ensure procedures and systems for good event handling are in place.

  • The operator doing the debugging will report the status back to the ICT contact at the school and/or the user.
  • The system for logging events must be available and working (both technically and functionally) for those working with event handling in schools and at the service desk.
  • The event logging system must be used for virtually all operational events.
  • Statistics from the event log should be compiled periodically. The statistics can be used to identify and eliminate recurring problems, which are irritating to users.

Planning and implementation

Setting up a workable system for logging events requires more than installing the system. Everyone in the operations department must use the system. Those reporting errors must also receive feedback by email with a ticket number. This requires significant effort in configuring the system for event logging. In addition, one must ensure basic user training for those who receive the requests.

Large and comprehensive plans are not required to implement proper event handling. Event handling is a completely standard task for those who work at the service desk or as ICT contacts at the schools. Setting up a computer tool for logging events may require up to a few weeks for a correct configuration, and users may also report events via e-mail and by phone.

The user interface to the logging system is relatively self-explanatory, so it should not take too long to get started. Daily use of the system will get users comfortable with what should be logged. It is crucial that everyone in the operations department uses the logging system for operational messages.

To get an idea of activities done following a reported event, we use an example.

A user contacts the service office with a problem, and reports that printing is not working. Operations logs the event immediately after the call is completed. A case is opened for the issue, and automatically given a case number.

Operations at the service desk make a quick analysis. Has the spooler stopped again, or is it something else? Is the paper or toner missing? The operator examines the spooler and sees that the queue has filled up. She deletes the queue and tests whether the next job is printed.

This time the print queue fills up again. Operations contact the school's ICT contact, asking them to check whether the paper tray is empty. This is noted in the event log. The ICT contact replies that they have refilled the paper tray, and that printing is back to normal. The case is closed, and this is noted in the event log.

If printing had not started again, the toner might have been missing or there might have been a printer error. If there was an error, operations would have to escalate the issue. This means that someone other than the operator or the ICT contact is needed to resolve the problem - in this example, a technician who can fix printers.

This example shows the whole workflow that needs to be investigated to get a printer working again. If a printer does not work even after checking that paper and toner are available, the issue needs to be escalated. The operations department must call in an expert to fix the problem - this time it was a service technician for printers.

What was wrong and what the fix was are noted in the event logging system.

Roles

A variety of roles are involved when the ICT service deals with reported issues. In the example above, the school's ICT contact and the operator cooperate to solve the printing problem. Had the issue been more difficult, they would have had to call a technician. If the printer could not be fixed, a new one would have to be purchased. If the school needed to buy a new printer, the ICT managers might need to arrange payment. In many organisations, the principal has the last word.

In short, it is easy for many people to get involved when something does not work. If possible, problems should be solved on the spot, trying to avoid including unnecessary people. Escalating problems which could be solved locally quickly becomes costly. Many enquiries are easy to deal with there and then, but other requests involve more complex problems which involve more people. If additional or external help is needed to solve the problem, this must as a rule be clarified with the operations manager. The important thing is to be aware of these points when handling operating events, so as to use resources appropriately.

Key points

We have set up some key points for handling incidents. With measurable and well-defined requirements, these points help in evaluating whether or not things are going well. Such measurement points are:

  • Total number of operational incidents.
  • Average time from receiving an inquiry to when the issue is resolved, classified with codes (a well organized operation department has codes for different types of events and errors).
  • Percentage of incidents handled within agreed response time (as agreed in the service level agreement).
  • Average cost for each event
  • Percentage of incidents solved by the service desk without escalation
  • Events per client machine (workplace)
  • Number and percentage of incidents solved by the operations center without the need for a visit to the school
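
As an illustration, several of these measurement points can be calculated from the records in the event log. The sketch below is only a minimal example: the `Incident` fields, the function name and the 8 hour limit are assumptions made for the example, and the real limit is whatever the service level agreement says.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    opened: datetime
    resolved: datetime
    escalated: bool   # True if someone beyond the service desk was needed
    site_visit: bool  # True if a visit to the school was required


def incident_metrics(incidents, agreed_time=timedelta(hours=8)):
    """Return a few of the measurement points for a list of incidents.

    agreed_time is an assumed limit; use the value from your own
    service level agreement.
    """
    total = len(incidents)
    if total == 0:
        return {"total": 0}
    durations = [i.resolved - i.opened for i in incidents]
    within = sum(1 for d in durations if d <= agreed_time)
    no_escalation = sum(1 for i in incidents if not i.escalated)
    remote = sum(1 for i in incidents if not i.site_visit)
    total_hours = sum(durations, timedelta()).total_seconds() / 3600
    return {
        "total": total,
        "average_hours": total_hours / total,
        "pct_within_agreed_time": 100.0 * within / total,
        "pct_without_escalation": 100.0 * no_escalation / total,
        "pct_without_site_visit": 100.0 * remote / total,
    }
```

Average cost per event and events per client machine can be added in the same way once cost and workplace data are recorded for each incident.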

Tools

A number of tools can make it easier to handle operational incidents.

  • Automatic logging
  • Automatic routing of events to the right persons
  • Automatic retrieving of data from the database for configuration management
  • Phone and email are used in conjunction with tools for registering requests and incidents.

Problem Management

Problem management is an "investigative" process. Known errors are most often handled directly by the service desk. This is the most common form of event handling. Investigating unknown errors requires both common sense and instinct. Good operations staff use instinct to go straight to the problem, find the solution and restore service as quickly as possible so that everything works normally.

Problem management consists of:

  • Problem control
  • Error control
  • Proactive control to prevent problems
  • Identifying error patterns, using information from, for example, event management

Problem control

  • Identify problems
  • Classify problems
  • Examine/research problems

Error control

  • Identify and register known errors
  • Find temporary solutions if possible
  • Contact those responsible for Change Management to remove the error permanently

Proactive control

  • Identify and solve problems and errors before the incident is reported by users.
  • Use logs and information from event handling to see how problems may arise
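
One simple way to use information from event handling proactively is to count how often the same error occurs on the same device. The sketch below assumes that each event is a dictionary with the hypothetical keys `device` and `code`, and the threshold of three is an arbitrary choice for the example.

```python
from collections import Counter


def recurring_problems(events, threshold=3):
    """Return (device, error code) pairs seen at least `threshold` times."""
    counts = Counter((e["device"], e["code"]) for e in events)
    return sorted(key for key, n in counts.items() if n >= threshold)
```

A printer that reports the same error again and again will then show up on the list, and can be examined before users report the next incident.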

Procedures for problem management

The Skolelinux/Debian Edu manual is a comprehensive collection of solutions for solving problems and configuring systems. Everything is on the Debian wiki pages. Solutions are maintained with the help of staff in schools, municipal ICT services, professional individuals and volunteers. See links to the English pages: https://wiki.debian.org/DebianEdu/Documentation/Manuals The pages are being translated to Norwegian bokmål. We are working to link to the bokmål pages too.

The Wiki technology has proven to be a great success for maintaining catalogued information on the internet. It's easy to contribute to and all changes are logged. It is also possible to import OpenOffice.org documents, and export documents as PDF.

Configuration Management

The resources spent on IT systems in schools must be handled in a financially prudent manner in order to control the services used and the equipment / infrastructure. The equipment, software and services have a whole range of settings - this is configuration, or a logical model of how infrastructure and services are set up.

To manage configuration, it must be identified, saved and maintained. One must also be able to keep track of different versions of the configurations. We call each part of a setup a Configuration Item (CI). A configuration file may, for example, ensure that certain users have access to a few printers in the network. Another can make sure you get a buffer on diskless clients.

An updated database for configuration management is essential to ensure rapid and controlled handling of operational issues, or changes in the layout of machines, programs or services.

Planning

It takes planning to set up a database for configuration management. One must decide in which areas the system will be used, and define the objective, policies and processes for storing and maintaining configurations.

  • Identify and select a structure for configuration according to the important parts of the ICT infrastructure. Configuration owners, name tags (attributes), dependencies, and relations between configurations all need to be considered.
  • Only approved configurations are managed in the database through the lifetime of the system. Control over access to the configurations can be done with group permissions, and can be done through the process of Change Management.
  • Status logging - keeps track of the condition and status of the various subsystems. This applies throughout the lifetime of the service, software or hardware. A configuration may be in production, disconnected or discontinued.
  • Checking and revision. Each configuration must be checked to confirm that the correct information is stored in the configuration database (CMDB). This is followed up with periodic reviews to ensure that the database is up to date.
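
The points above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names are invented for the example, and a real configuration database (CMDB) would hold more attributes.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    IN_PRODUCTION = "in production"
    DISCONNECTED = "disconnected"
    DISCONTINUED = "discontinued"


@dataclass
class ConfigurationItem:
    name: str
    owner: str
    status: Status = Status.IN_PRODUCTION
    depends_on: list = field(default_factory=list)  # names of other items
    history: list = field(default_factory=list)     # status log

    def set_status(self, new_status, note):
        """Change the status and keep a log entry of the transition."""
        self.history.append((self.status, new_status, note))
        self.status = new_status
```

For example, `ConfigurationItem("cups-printers", "operations")` starts out in production, and a later call to `set_status(Status.DISCONNECTED, "printer sent for repair")` leaves a trace in the status log.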

As we see, there is a lot of planning needed in order to have configuration management in the IT system. The purpose of planning as part of IT operations is to ensure that systems are fixed quickly when they go down. With a good configuration management, it is easy to replace a defective machine with a new one. The configurations can be quickly transferred to the new computer and the IT system functions just as well as before.

Management of Configuration Items (CI)

A configuration item is a part of the infrastructure. It is normally the configuration of a service or a program. Sometimes users want to change how a service works. One needs to keep track of the configurations when changes are made.

To bring this down to earth, we can imagine the configuration of the print server. You want to add a new printer to the computer network and add it to the printing system CUPS. When you change the configuration, through a web application or via the configuration tools in KDE, the CUPS configuration file changes and you must restart the print server. The restart can also be done in the KDE tools or through a web application. The modified setup file is then copied to a directory where it can be handled by a version-control system.
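
The last step, copying a modified setup file into a directory under version control, is easy to script. The sketch below is a minimal example and not a Debian Edu tool: the function name is invented, and the paths passed to it (for instance a CUPS file such as /etc/cups/printers.conf and a local working copy) are assumptions for the example.

```python
import filecmp
import shutil
from pathlib import Path


def save_config_copy(config_file, repo_dir):
    """Copy config_file into repo_dir if it is new or has changed.

    Returns True when a copy was made, so the caller knows there is
    something to commit to the version-control system.
    """
    source = Path(config_file)
    target = Path(repo_dir) / source.name
    # Compare file contents, not just size and timestamps.
    if target.exists() and filecmp.cmp(source, target, shallow=False):
        return False
    Path(repo_dir).mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return True
```

The commit itself is left to the version-control system in use, and the reason for the change still has to be written down by the person who made it.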

Among the many possible settings, a few are common to most services: whether a service should run, stop, terminate, start, be interrupted or be taken out of use.

One should be cautious in changing configurations without a proper plan. It is easy to forget what you have done on a server or a PC. Therefore it is important to document the changes made in a change log.

Planning and installation

The configuration of the computer network is connected to the architecture. Much of the planning is already done in Debian Edu. Setting up servers with a corresponding service level using Windows Server, RedHat or other GNU/Linux distributions may take 3 to 4 weeks. With Debian Edu it takes 1-2 hours. If you want a fixed IP address for the network, a professional spends an extra ½ hour on this. This is because web services are set up with reusable names.

What then must be planned is which additional user programs to use, and which subsystems should interact with Debian Edu. The school may, for example, have an electronic whiteboard.

Check list

We have made a list of activities and solutions that are important in good configuration management.

  • Establish a version-controlled area for saving configurations for all servers and selected workstations and laptops. Git and SVN are often used for this. Remember to take a daily backup of the area, and make sure to save all changes in configurations.
  • Use an electronic system for taking care of recipes explaining the configuration of different types of machines, the network and services. Such recipes let others who help out or take over operations read up on what has been done. A wiki can be suitable for this.
  • Use one specific version of the operating system and software on all machines. This is to avoid maintaining many different versions of the software. Ensure that the software is well tested. Therefore, it may be wise to wait 6-12 months before adopting the latest edition of a program.

Relations to other processes

Management of configurations is closely connected with the handling of problems and with the availability of the systems. If printing stops too often, a configuration change may solve the problem. One example is to establish a routine for clearing the print queue and restarting the print service.

The aim of the changes you make in the configurations is usually to increase the availability of services or programs. It may also be to restrict access to certain programs or services to specific times. To achieve this, one must reconfigure the service. In addition, it may cost money beyond what was agreed on as service level or capacity of the system.

The examples show that managing configurations touches a number of other areas. There is therefore much to gain from putting in place good practices for managing changes in configurations. Automation is also advisable if you want greater stability, or access to certain services in specific periods.

Tools for configuration management

As mentioned under Check list, one may use:

  • A version-control system, for example Subversion, for saving the configuration files
  • A wiki for storing documentation of setups and wizards
  • A common directory for operational documentation on the internet, maintained by Skolelinux/Debian Edu staff in the schools

Change Management

Many ICT services are not good at handling changes in ICT systems, which leads to many disgruntled users. Surveys in the public sector in Denmark show that operating costs go down when you have good control of changes. Therefore, it pays to involve users through training and participation related to the changes made.

Change messages depend entirely on proper processes. This applies regardless of whether the changes are small or big. Therefore it is important to have the right people in place when making changes, both to give training and to answer questions. This becomes especially important when adopting new releases of software and services. This is independent of whether one uses free or proprietary software.

Change Management should ensure that all changes are made in a standardized and correct manner. It is important to anchor the decision about a change at the appropriate level in the organisation. Standard changes can often be pre-approved once they have been done a few times. But major changes will often involve a decision at a higher level, between school management and operator.

The reason why the management should be included is that an upgrade will often require training of users. It may be an upgrade to a new browser or a new version of office software. This can quickly lead to half a day of training in what is new in a program. Such changes must be agreed with the management. The changes must also be made without other parts of the system ceasing to work.

Those with responsibility for approving changes receive a so-called change message or RFC (Request For Change). With an RFC you can assess whether the change should be performed. Many times you have to clarify with management if optional changes should be made, and if so, when it will happen.

When making changes one must also cooperate with the school's ICT contact. One must ensure that changes occur when it fits with the school's plans. Implementing significant changes without Change Management can lead to much dissatisfaction and additional inquiries to the Service Desk. This creates significant unplanned extra work. In addition, it may lead to a change that soon has to be rolled back. You quickly get twice as much work and end up back at the start. Had the necessary approvals been obtained, the change could have been made in a planned and straightforward manner.

Change Management is done to avoid unnecessary extra work. Making changes obviously requires work, but planned changes cause less extra work. One also avoids having to roll back changes because of problems that arise when users are unprepared for substantial changes.

When you, for example, update the entire system to a new version, make sure that everyone is informed. One must look into whether those affected by the change need training. The right professionals must prepare it all, so there are no surprises.

All responsibility must not land on the release manager, the person responsible for managing versions of software. Release handling is a process which should preferably deal with changes that contain many minor changes. This usually happens when rolling out new systems and services, or when upgrading the entire system to a new version.

Activities

  • Register the change message, or RFC (Request For Change, see above), and check that it has been given a unique number.
  • Prioritize and categorize the changes
  • Remove changes that are not possible. This can be done by marking them as not possible.
  • Give feedback to the person who submitted the change message
  • Make sure you have a Change Advisory Board, where the change is dealt with, discussed and evaluated. This consulting group can consist of selected ICT contacts and operations personnel with long experience.
  • Coordinate changes with Release Management, which handles different versions of applications and services.
  • Review and close the change message (RFC)
  • Remember to save modified configurations in the repository for configuration files.
  • Reports
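
The first activities on the list (unique numbers, prioritizing, marking changes as not possible) can be illustrated with a small register. This is only a sketch: the class names, the status values and the three priority levels are assumptions made for the example, and a request tracker would normally do this job.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ChangeRequest:
    number: int
    summary: str
    priority: str = "normal"  # assumed levels: high, normal, low
    status: str = "new"       # assumed values: new, not possible, closed


class ChangeRegister:
    def __init__(self):
        self._numbers = count(1)  # source of unique RFC numbers
        self.requests = {}

    def register(self, summary, priority="normal"):
        rfc = ChangeRequest(next(self._numbers), summary, priority)
        self.requests[rfc.number] = rfc
        return rfc

    def mark_not_possible(self, number):
        self.requests[number].status = "not possible"

    def open_by_priority(self):
        """Return new requests, highest priority first, oldest first."""
        order = {"high": 0, "normal": 1, "low": 2}
        new = [r for r in self.requests.values() if r.status == "new"]
        return sorted(new, key=lambda r: (order[r.priority], r.number))
```

The list returned by `open_by_priority()` is what the Change Advisory Board would go through at its meeting.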

Even what may look like a small, insignificant change message can have major consequences if the change is implemented. We have examples of schools with a stable Debian Edu network where all the programs work. Then a test version of a popular program is installed, it crashes constantly, and Debian Edu gets blamed.

An example is schools that installed the test version of the latest OpenOffice.org before the program was finished. Several thought it could be fun to try it out. The problem is that test editions are usually released to find errors and instability in applications. They are not intended for production use.

In production, the general rule is that you don't install test versions of software. Most operators recommend using the next to latest version of a program intended for production. After 6-12 months the worst errors have usually been removed from a new main version of an application.

This means one often waits until summer before updating to a program that was released just before New Year. This fits well with the school year. The alternative may be instability and irritated users. The advisory group therefore plays a key role when small or large changes are made.

Release Management

Release handling covers the management and planning activities that prepare for wanted changes. The changes can be small or large, where large changes can consist of many smaller changes. Release management goes on before initiating the actual job of installing software and hardware into production.

First the planning and testing of new releases are carried out. Then it is all rolled out into production. Deployment is part of the infrastructure management. The procedure is to implement what is planned, tested and ready within the systems for Configuration Management. Once everything is planned and tested, and the configurations are stored, the solution is rolled out in production.

Usually, many service providers and suppliers are involved. This applies both to the acquisition of machines, the software used, and the recommended configurations. Good resource planning is crucial to package and distribute a new release in a good way for users. Cutting corners in this area can lead to equipment that doesn't work, or that goes unused because of deficiencies in the installation.

Release Management takes a comprehensive approach to a change in a service, and ensures that all parts of a release are seen in context. This applies to both technical and non-technical factors.

Basic

As you can see, for computers, software and network to work as planned, release-management is crucial. Proper handling of releases prevents disruptions. New releases or changes can be introduced while operations continue as normal, without interruption or reduction in quality.

Implementing changes or new releases can be compared to building a new road. Cars must still get past even if you build a new road on top of the old. Good signs must be in place. One must also have the necessary resources to rebuild the road. If you lack the resources to make changes, it's better to let it be.

Some might think that proper release management is boring as one doesn't get to implement the latest version every time something new is released. But often the operations department lacks the resources to handle a flood of complaints should an upgrade fail. High uptime requires established technology, as said by Linux expert David Elboth in the Linux Magazine (1/2004). He writes:

  • The more you demand of the system the more stringent are the requirements of the individual components. High requirements for uptime results also show that the choices you are left with are old technology. Only empirical data over time can say anything about downtime. We have all noticed how far behind are Red Hat and SuSE with their server products.

To get few complaints, with a stable and reliable environment, requires solid release management. Alternatively, a bunch of complaints and dissatisfied users emerge, caused by installing insufficiently tested cutting-edge software. Amateurs have a tendency to underestimate the consequences of software upgrades. If something works fine on your home computer, it does not mean that this will work in a wide network with 500 client computers and 3200 users.

Definitive Software Library (DSL)

A software archive in an operational context is a collection of original copies of the software in use. If you use Skolelinux 2.0, this is the software package. The phrase software archive is used differently in some other contexts, especially among programmers. When it comes to operations, we would be talking about the original software package of a particular version which is used for the installation.

By using free software, the software archive may be Skolelinux 2.0 plus the extra programs you have added from various sources. There may be certain versions of Macromedia Flash, Java and decoders which make it possible to run national tests in the browser, or to watch broadcasts from a national TV station.

If you plan to upgrade to the next version of Debian Edu when released, this new version shall be the main program archive. The new archive shall also include appropriate versions of all additional applications beyond Debian Edu.

Set-up files customized or created locally by the operations department are not included in the main program archive. Configurations are saved separately in a version-control system or database.

Database for configurations and hardware

As mentioned in the chapter on configuration management, you must create a database or a version-controlled directory to take care of set-up files. One should also keep track of all computers, what kinds of machines are in use, performance, and unique standard addresses on the network cards (MAC addresses).

There are many reasons to have an overview of the equipment. One of the main reasons is to keep track of how many machines are in operation, how many are not in use and how many are being repaired. Another reason is planning for upgrades.
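
Such an overview does not need to be complicated. The sketch below assumes the inventory is a list of dictionaries with the hypothetical keys `mac` and `status`, for example read from a spreadsheet, and shows the two most common questions: how many machines are in each state, and which machine owns a given MAC address.

```python
from collections import Counter


def summarize_inventory(machines):
    """Count machines per status, e.g. in operation, unused, in repair."""
    return Counter(m["status"] for m in machines)


def find_by_mac(machines, mac):
    """Return the machine with the given MAC address, or None."""
    wanted = mac.lower()
    for machine in machines:
        if machine["mac"].lower() == wanted:
            return machine
    return None
```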

Build management

A variety of applications in addition to browser and office suite are installed in schools. Educational programs for learning, browser plug-ins, and programs for multimedia are needed. The systems also have network set-up and changed settings in specific programs. When you have many servers and perhaps thousands of clients, the need for effective tools for deployment soon makes itself felt. Such tools are standard in Debian Edu.

Build management is about ensuring that you always install the required software packages, services and proper settings both of individual programs and for the network. Many people have heard about the so-called "images". One installs the operating system with all needed programs and configures the network. Then one uses an image program to make a copy of the hard disk. This "disk image" can then be copied to other computers.

It is not necessary to build such disk images. Debian Edu is based on Debian which has an excellent package management system. There is no need to compile applications, as ready-made packages can be installed directly from the Internet. It is enough to work out what changes you want to the default set-up of Debian Edu or the main program archive in use. Then you make one or more scripts to run on each machine that get everything installed and set up.
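
A script of this kind can be very short. The sketch below builds an `apt-get install` command for a list of additional packages. The package selection is only an example, the function names are invented, and by default the command is returned without being run so that it can be inspected first.

```python
import subprocess

# Hypothetical selection of additional packages for the schools.
EXTRA_PACKAGES = ["gcompris", "stellarium"]


def build_install_command(packages):
    """Return the apt-get command that installs the given packages."""
    if not packages:
        raise ValueError("no packages given")
    return ["apt-get", "install", "--yes"] + sorted(set(packages))


def install(packages, dry_run=True):
    """Install packages, or only return the command when dry_run is set."""
    command = build_install_command(packages)
    if not dry_run:
        subprocess.run(command, check=True)  # needs root privileges
    return command
```

Local settings would be handled in the same script, for example by copying configuration files from the version-controlled area.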

For most situations, scripting is an easy way to "build" and roll out programs and configurations. But there are situations where building disk images may be the solution, e.g. for installation on many laptops.

As we see, handling the construction process is about facilitating deployment on many computers. In exceptional cases, this may involve building a tailor-made Debian package. But in most situations, everything is ready-packaged. Then you have to put in place a script which installs additional programs and certain settings. One can also create disk images if you have many similar machines, such as laptops for all students.

Testing

It is essential to test new applications, configurations, and new services before they are put into production. Several schools have experienced instability when they have installed software without making the necessary adjustments. Therefore it is crucial to test changes in configurations or new versions of the software before the change is made on all machines.

Testing generally takes place in three steps.

  • First, do an installation of the changes on a test network. This is technical testing to check that everything hangs together in a system with few users. Take care to include all changes in configuration files.
  • When you are sure that everything works on the technical side, try installing the solution in one school. It is very important to agree about the testing with the school's ICT contact. Users must also be fully briefed on changes made for the sake of testing. Take care to preserve current adjustments in the set-up files, which may have been made in the course of normal maintenance.
  • When you are sure everything works, you can roll out the solution to all schools. It is easiest to create a script that simplifies upgrading of software packages, services and configurations.
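
The three steps can be expressed as a loop that stops as soon as one stage fails, so that a faulty change never reaches all schools. This is a sketch only: `deploy` and `verify` are placeholders for whatever scripts and checks the operations department actually uses.

```python
STAGES = ["test network", "pilot school", "all schools"]


def run_rollout(deploy, verify):
    """Deploy stage by stage and stop at the first failed verification.

    deploy(stage) installs the change, verify(stage) returns True when
    the stage works. Returns the stages that completed successfully.
    """
    completed = []
    for stage in STAGES:
        deploy(stage)
        if not verify(stage):
            break
        completed.append(stage)
    return completed
```

If the returned list is shorter than `STAGES`, the last stage that was deployed is the one to roll back.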

Fall-back solution

Much can go wrong during a new installation or upgrade. Therefore, one must have ready a fall-back solution. This lets one quickly get back to the system as it was before the upgrade. In technical terms, this is called roll-back.

When rolling back it is absolutely essential to have ready the previous version of the software archive and configuration files. This means that you can install for example Edu 1.0 in under an hour, and put in place the appropriate configuration files.

But roll-back takes time. Therefore, it may be prudent to have a server ready with the previous version of the software, the right configurations, and a recent copy of the users' home directories. This server can quickly replace any machines on which the upgrade does not go according to plan. Having server machines in reserve can ensure high availability even if something goes wrong.

Advantages and possible problems

The advantage of having records of the software in production should not be underestimated. Many rely on having the software on their respective CDs and DVDs. This is inefficient distribution. To save time and trouble all the software in Debian Edu is available online.

Your operating department can create a copy of the Debian Edu archive on a central server. From here, all the software can quickly and smoothly be installed on other machines. The advantage is that your ICT service has a constant overview of the versions of the software they have made available to schools. This also prevents the installation of software that has not been considered by the Change Management.

There may be considerable problems if you do not maintain the software archive and configurations. It can also lead to mistakes with a configuration or software package. Then this gets rolled out to all machines. In addition, some schools may install insufficiently tested software or use beta releases in production. So one must have good processes and have someone to take responsibility for maintenance of the program archive and configurations.

It may seem like one needs a lot of extra things in place in order to install and maintain the services and programs that are in use. However, if you skip the tools that provide management of upgrades, you give yourself a lot of extra work. The ICT service must spend a lot of time on manual installation on each machine. The danger of making mistakes increases. When things do not work you get disgruntled users, and much time is spent fixing problems.

Many who operate major IT systems have inadequate plans for changes. Some have no plans at all, but just install new versions of software. Changes can be perceived as problematic by some users, because functions they are comfortable with change place in the user interface. For operations it can go completely wrong. For example, when Arendal municipality was to upgrade from an older version of Windows to a newer one, most things stopped working. The ICT service said they had several computer programs that were held together with "wire and tape." It took half a year to clean it up.

Planning and implementation

The reason for planning before implementing changes is to avoid weeks or months of delay due to problems. The time used for planning is quickly regained because one avoids additional problems. There will always be people who say they have had no problems with ad hoc changes in the systems; but closer examination reveals that there are problems after such changes, they merely don't get communicated.

In our eyes ad-hoc solutions are only a detour through changes, and only an emergency measure. An ad-hoc solution is like a temporary repair with "wire and tape." One must in due course clean up such solutions to ensure stable operation without constant surprises. Skipping a planning phase leads to many more ad hoc solutions, and several operational problems when changes or upgrades are done. Therefore it is essential that professionals and management understand the value of a well-planned process for changes.

Therefore, we recommend that you convene a meeting for planning, and make a stepwise plan for changes in the system. A stepwise plan will naturally vary according to the change. Upgrading the OpenOffice.org suite is quite different from upgrading the whole system. When upgrading to a new office application, a 2-3 hour tour of the office suite may be enough for the teacher in each school. When upgrading the entire system one must both provide user training and test that the technical details work as intended.

The main point when it comes to planning and implementation is that there are few shortcuts. Studies show that those who plan properly and ensure that people have the right skills have lower operating costs.

Activities

It is crucial to plan new releases. Most modifications of the system should be clarified with the management. The following list of activities is designed to support the upgrades in a planning and implementation phase.

Tasks and details:

  • Prioritization of the release: Check that the necessary decisions have been made before a change or upgrade is deployed.
  • Definitive Software Library: Ensure that the appropriate software packages to be installed are in place in the definitive software library.
  • Configuration database: Be sure to have all configuration files in place. This applies both to those that are in use, and to the new ones supplied with the systems to be changed or updated.
  • Build management: All scripts and systems used to deploy or create disk images must be in place.
  • Testing: First, run trials on test equipment. When this works without any problems, it can be tested at a school. The school must be fully informed about, and fully in on, trying out new software. When one is sure that everything works, you can upgrade for all.
  • Fall-back solution: Even with extensive testing, new releases may go wrong. Therefore it is essential to have a fall-back. The easiest solution is to keep the old installation with data in reserve on a separate server machine. Such a machine can be plugged in if the change or upgrade does not work.

Tools

As seen from the activity list, one needs several tools to keep track of different releases of software, services and hardware in the system. Some of these tools have been mentioned previously, but we repeat them here:

  • Debian tools for the definitive software library
  • Database for configurations and hardware (subversion setup files, spreadsheets detailing all hardware with physical location)
  • Build management (the system which builds Debian packages)
  • Hardware for testing and backup-solution

Relations to other processes

Release management goes directly to the core of the ICT service. It covers implementing appropriate security updates, changes in services and upgrades of computer software. Requests for new releases may be due to operational problems or a desire for new software. Before a new release, it is assessed whether the change is necessary.

If the change is straightforward, one makes the necessary changes in configurations and prepares the application packages for rollout. These must have been tested, and backup solutions must be in place. When changes are made, parts of the operational activity may have to change as well. It is easy to see that change management affects all parts of operational support.

Tools for operational support

The first thing you should ask yourself is: "Do we really need software tools?" If the answer is yes, it is crucial to examine the options thoroughly.

If we go by glossy brochures and sales talk, we are totally dependent on such tools. But good people, good process descriptions, good procedures and job descriptions are the basis for good service management. The need for tools, and how complicated they must be, depends on the organisation's need for computer systems and on the size of the organisation.

In a small organisation, a single freely available database for logging and managing events (a request tracker) will be enough. But larger organisations will almost certainly need sophisticated, distributed and integrated tools for service management. This means linking all processes to a system for event handling.

Although tools can be important, they are not important in themselves. What matters are the tasks and processes to be carried out and the information needed. These provide the necessary information to specify which tools are best suited to support operations. Here are some reasons why one may use software for operational and service management:

  • increased demands from users
  • lack of ICT knowledge
  • budget limitations
  • the organisation is entirely dependent on the quality of service
  • integration of systems from multiple vendors
  • increased complexity of ICT infrastructure
  • emergence of international standards
  • extended scope and changes in ICT

Automatic tools allow:

  • Centralisation of key functions
  • Automation of functions in the service delivery
  • Analysis of data
  • Identification of trends
  • Implementation of preventive measures
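
As a rough illustration of the data analysis and trend identification mentioned in the list above, the following sketch counts tickets per category and month and reports the categories that are growing. The record layout (`created`, `category`) and the sample tickets are assumptions made for this example; they are not the export format of any particular request tracker.

```python
from collections import Counter
from datetime import date

# Hypothetical ticket records; a real export would come from the ticket system.
TICKETS = [
    {"created": date(2015, 3, 2), "category": "printing"},
    {"created": date(2015, 3, 9), "category": "login"},
    {"created": date(2015, 3, 17), "category": "printing"},
    {"created": date(2015, 4, 1), "category": "printing"},
    {"created": date(2015, 4, 3), "category": "printing"},
    {"created": date(2015, 4, 20), "category": "printing"},
]


def tickets_per_month(tickets):
    """Count tickets per (year, month, category)."""
    counts = Counter()
    for ticket in tickets:
        created = ticket["created"]
        counts[(created.year, created.month, ticket["category"])] += 1
    return counts


def growing_categories(tickets, earlier, later):
    """Return the categories with more tickets in `later` than in `earlier`.

    Both `earlier` and `later` are (year, month) tuples.
    """
    counts = tickets_per_month(tickets)
    categories = {ticket["category"] for ticket in tickets}
    return sorted(
        category
        for category in categories
        if counts[later + (category,)] > counts[earlier + (category,)]
    )


if __name__ == "__main__":
    print(growing_categories(TICKETS, (2015, 3), (2015, 4)))
```

A category that keeps growing from month to month is a candidate for problem solving rather than repeated troubleshooting.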

Type of tool

In this chapter we have proposed a number of tools to improve operational support. Here follows a summary of the tools:

  • Debian tools for the definitive software library
  • Database for configurations and hardware (Subversion for setup files, spreadsheets detailing all hardware with physical location)
  • Build management (the system which builds Debian packages)
  • Hardware for testing and backup solutions
  • Request Tracker
  • System for monitoring (Munin)

As the operations department gains further experience with systematic routines, additional and different types of tools will be developed or acquired.

Evaluation criteria when selecting tools

Although large amounts are spent on creating evaluation criteria for software, the result is only experience-based guidelines. There is no final answer to what is good or less good software. Like much else, it is partly a matter of taste. Different solutions may do the same job equally well, yet have quite different designs. Still, some rules of thumb may be useful.

The main evaluation criterion is whether the job needs to be done at all. Many IT tools are well made and work without error, but solve problems that did not need solving. So the main criterion is whether the tool solves the right problem, and whether anything needs to be done at all.

  • So the first thing one asks is whether the tool is needed.

If it turns out that a task has to be done, the solution may be as simple as running some commands manually. The simplest way is best. But when one has many machines to operate, automation becomes crucial. It is too much work to log into 20 similar servers one by one to do a security upgrade. That is when automation is needed.

  • So here one must ask whether the tool is useful for solving the task.
  • Then one must ask whether the tool is usable.
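
The case of logging into 20 similar servers for a security upgrade can be sketched as a small script. This is only a minimal illustration under stated assumptions: the host names are hypothetical, the servers are assumed to be Debian-based and reachable over ssh as root with key-based login, and a real installation may well prefer a dedicated configuration management tool.

```python
import subprocess

# Hypothetical host names; replace with the real server list.
SERVERS = ["server%02d.example.org" % n for n in range(1, 21)]

# Assumes Debian-based servers where apt-get performs the upgrade.
UPGRADE = "apt-get update && apt-get -y upgrade"


def build_command(host, remote_command=UPGRADE):
    """Return the argument list that runs remote_command on host over ssh."""
    return ["ssh", "root@" + host, remote_command]


def upgrade_all(hosts, run=subprocess.call):
    """Run the upgrade on each host in turn and return the hosts that failed.

    `run` takes an argument list and returns an exit status; it defaults to
    subprocess.call and can be replaced with a stand-in for dry runs.
    """
    failed = []
    for host in hosts:
        if run(build_command(host)) != 0:
            failed.append(host)
    return failed
```

Calling `upgrade_all(SERVERS)` runs the same command on every server and returns the ones that need manual attention, so the operator only has to log in where something went wrong.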

There is often a wide range of programs and procedures for solving a specific task. But some problems are solved completely differently when maintaining 500 computers and 11 servers than when fixing your home PC. An example might be tools that allow the teacher to see the desktop of each student on his or her client machine. The teacher can stop and start programs for all pupils, and prevent individual pupils from using, for example, instant messaging when this interferes with school work.

The choice of operational tools is about automating and simplifying operational tasks, and about reducing manual work to a minimum. In short, the motivation is to automate maintenance. Here too the goal is to make things easy, which can itself take considerable work to achieve.

As you can see, it is not easy to set up good criteria for selecting operational tools for large installations. Most of all, this is because software developers often lack experience in operating IT systems. They are mostly used to creating new things, but creating good and relevant tools for operations requires many years of experience.

Some general types of operational tools have remained the same for the last 20 years, even though the specific products in use may have been replaced. Some programs may also become irrelevant within a few years. Therefore, one must count on training in new versions of the applications used for operations, and in upgrades and changes to user programs.

Product training

Thorough user training means that a lot of support can be handled informally, in direct conversation between users. Training often costs as little as 1% of the total operating costs. It is well worth spending a little more on training, as the effect is very positive. The same applies to proper training for ICT contacts in schools and for operators. Training ICT contacts to use simple systems for password changes, error reporting, etc. will improve the quality of calls to the IT service.

In Norway, education and product training are regulated by the Labour Act (§ 4-2):

  • Employees and their union representatives shall be kept informed of systems used in planning and carrying out the work. They shall be given the training necessary to familiarize themselves with these systems, and they shall take part in designing them.

In short, it can be advantageous to increase efforts in training, which will improve the ICT service and provide a significant cost reduction. This is because users and IT contacts become more confident and better at helping each other. It should also be noted that the transition to new software can provide an opportunity to simplify some operating practices. Simplification can reduce the need for product training.

Planning at the start of the implementation of service support

A growing number of organisations see the necessity of service control. It is often the practice to base decisions on historical and political considerations rather than on the organisation's current needs. It is therefore important to ensure that management commits to participating in and understanding the working methods of the organisation, goes through the existing processes, and compares them with the organisation's needs and "best practice".

Implementing service support

Health check

Feasibility study

< FIXME>

Determine current situation

Health check

General guidelines for project planning

Business case for the project

Critical success factors and possible problems

Project costs

Organisation

Product

Planning

Communication plan

Project review and reporting

Progress

Evaluation of the project

Supplementary work

Reviewing to check compliance with the quality parameters

Reviewing with regard to key factors