osa logo
OSA Home
Start Here
Data Overview
Known Issues
the Surveys
Coverage Maps
Schema browser
Data access
Archive Listing
Freeform SQL
SQL Cookbook
Release History
osa logo bottom
Home | Overview | Browser | Access | Login | Cookbook 
  OSA logo

OSA Questions and Answers

Before contacting osa-support@roe.ac.uk if you have a query about accessing the OSA please see if it has already been covered in the Q&A listed below. (note: please do not contact individuals in the Wide-Field Astronomy Unit, use osa-support@roe.ac.uk).

General look-ups between tables *Source and *Detection.
In general when querying the *Source table, you may find that certain catalogue quantities that you need are not present. This is because the table design in the OSA is to write only a subset of the most useful attributes available from *Detection into *Source (for efficiency reasons). For example, CASU standard source extraction produces 13 aperture magnitudes for every detection in every passband and these are of course present in *Detection, but in *Source you will only see a subset of 3 per passband - aperMags 1, 3 and 4 for example. How do you query if you are interested in an aperture magnitude other than 1, 3 or 4 - say aperMag2 in the G band for an ATLAS query? The answer is to do a join query between *Source and *Detection, noting that each row of *Source is a member of a frame set the details of which are stored in *MergeLog (hence the join has to include *MergeLog as well). The SQL syntax is best illustrated by an example:
SELECT top 10 s.ra, s.dec, gd.aperMag2 as gAperMag2, s.gAperMag3
FROM atlasSource AS s, atlasMergeLog AS l, atlasDetection as gd
/* Join source to merge log for the frame set: */
s.frameSetID=l.frameSetID AND
/* Join merge log to detection for the J frame: */
l.gmfID = gd.multiframeID AND l.geNum = gd.extNum AND
/* Join source to detection to look up the detection required: */
s.gSeqNum = gd.seqNum and gd.seqnum >0
Note that the final predicate could be done via objID: ... AND s.jobjID = jd.objID but this is most inadvisable for efficiency reasons (the attribute combination multiframe number, extension number, sequence number is the primary key in *Detection so look-ups are very fast on that combination) and furthermore for the more subtle reasons detailed below.

Default detection rows and objID or (multiframeID,extNum,seqNum) for look-ups between *Source and *Detection.
In the example *Source-to-*Detection join query given above, one might be tempted to use objID as the look-up into *Detection in place of the more long-winded predicate combination involving multiframeID, extNum and seqNum since both the former attribute and the latter 3-attribute combination are unique. Users should be aware, however, that there are default rows in the *Detection table to allow joins between *Source and *Detection that allow selection of *Source rows that have incomplete passband coverage when querying that table in conjunction with a join with *Detection. Since there is a default row in *Detection for every (non-default) multiframe/extension number combination, looking up a row in *Detection via objID alone when that attribute is a default produces many matches for all those default rows in *Detection having that same default value in that table. Those rows could be excluded by an addition WHERE ... AND objID > 0 but this is very inefficient and will be inappropriate if you are trying to select source rows that may have non-detections (and therefore default objIDs) in some passbands. Hence, ALWAYS use the correct relational association of the combination of multiframeID, extNum and seqNum when joining *Source to *Detection, as in the example query given above.

Trade-off between completeness and reliability in point source sample selections
There is always a trade-off between completeness and reliability when using catalogue data, since no image analysis software is perfect. The morphological classification attributes class and classStat allow you to make a choice about how to maximise completeness, or conversely minimise contamination, at query time - for examples of this, consult the SQL scripts defined under "views" in the schema browser. Note that although the integer classification scheme is based on the N(0,1) measure of stellarness (classStat) it has several overrides built in to attempt to make it more reliable. ClassStat is equivalent to a straight likelihood measure based directly on the curve-of-growth of the aperture fluxes. But because the statistic is not strictly speaking a Gaussian distribution and has much broader tails, the Class parameter has assorted overrides invoked based on saturation, ellipticity as a function of magnitude and so on. The most reliable way of getting a reasonably complete stellar-like sample is to require class of -1 and -2 on the bands in question and then if you are concerned about completeness add in any extra sources with ClassStat in the range 0 to +n-sigma; where n is between 3 and 5 depending on how non-stellar you want to push things.

Compressed image format (CFITSIO RICE compression)
CASU pipeline processed images are supplied in a compact, compressed format to reduce online storage and network transfer overheads. The format employed is known as Rice tile compression, and is implemented in the CFITSIO library that is used by the core processing software. By default, images are delivered from the WSA in this format: network transfer will be a factor 3 to 4 times quicker, and end-user storage requirements a factor 3 to 4 times smaller, if you download the compressed image files (the compression ratio depends on the noise properties of the image in question; these figures are typical for short WFCAM exposures, as in the UKIDSS Large Area Survey). Hence it makes a great deal of sense to stick to using Rice compressed products as opposed to selecting standard decompressed images (which are also available). The only disadvantage is that the Rice compressd format is not universally supported in applications software at present; note however that it is easy to download and install an image copy utility (imcopy) that can decompress to standard FITS (the Rice compressed images are actually stored as FITS binary tables and not standard FITS images - this is why some image display utilities will not handle them correctly). On a system with Starlink you can compile imcopy using

gcc -I/star/include -L/star/lib -lm imcopy.c -lcfitsio -o imcopy
The OSA imcopy page gives a step by step guide to downloading and installation. CFITSIO Quick Start Guide includes additional informational on installation.

When trying to view or uncompress our FITS files, in some rare cases you may come across one or both of the following error messages:

FITSIO status = 1: non-CFITSIO program error 

decompression error: hit end of compressed byte stream

This is because our FITS file images have been compressed using a newer version of fpack & CFITSIO. The only solution is to first uncompress the FITS file using a version of CFITSIO >= 3.1 (fpack >= 1.0) prior to viewing.

Querying date/time quantities
All dates and time stamps used in the data flow system (i.e. from the telescope/instrument, through the pipeline processing and at the archive end) are universal time co-ordinate (UTC) quantities; the SQL data type for these quantities in the archive is the MS SQL Server datetime type. To query a datetime type, e.g. compare a datetime quantity against a literal constant, simply express that constant as a string in double quotes in the following format:
SELECT count(*)
FROM Multiframe
WHERE dateObs >'2005-Apr-18 14:00:00.0'
will return a count of the number of frames in the archive that have observation dates and times since 14:00:00.0 18th April 2005 (UTC)

Querying string quantities
String quantities (SQL data types char or varchar) can be queried in the WHERE clause of an SQL statement and compared using certain comparators (see the Cookbook) in the same way as numerical quantities. Perhaps the most useful comparator, however, is LIKE. For example, you can select all archived FITS file observations from a given date string in the filename (say 1st April 2005) by the following statement input into the freeform SQL box:
SELECT fileName
FROM Multiframe
WHERE fileName LIKE '%20050401%'
Note the use of double quotes for string literals, and the use of the wildcard percent character to match any string.

Querying cross-matched catalogue data (e.g. ATLAS v. SDSS)
OSA catalogue tables (i.e. those created for the VST surveys) are automatically cross-matched to a number of external survey catalogue datasets held locally, e.g. SDSS Data Releases; FIRST, IRAS, ROSAT and 2MASS catalogues; and legacy photographic catalogues like USNO-B and the SuperCOSMOS Sky Surveys held within the SuperCOSMOS Science Archive (for a complete list, click on "Browser" on a navigation bar). Rather than prejoining any two datasets to create a static, merged table of what are assumed to be associated sources on the basis of a fixed joining algorithm, the OSA philosophy is to create a table of pointers between any two matched datasets. This means that any externally catalogued source co-incident or nearby a VST source is readily available, out to some predefined maximum angular search radius, and all sorts of science usages are possible because the exact matching criteria (e.g. maximum allowable displacement, consistency in morphological parameters or classification, or even the number of possible external matches for a given VST source) can be tuned at query time by simple expressions in SQL. Furthermore, all attributes of both datasets are available to querying, because no decision has been made as to which attributes to propagate into a static merged set. The flip side to this flexibility is the rather strange overall syntax for querying cross-matched data in SQL: instead of querying a single merged table, e.g. SELECT * FROM MergedTable WHERE ..., in general you have to query three tables: the two cross-matched tables and the table of pointers, e.g.
FROM Table1, ExternalDB..Table2, CrossNeighboursTable2
A complete tutorial, and example queries employing cross-matched data, are available in Section 4 of the OSA Cookbook.

Coordinates - decimal degrees, sexagesimal, radians:
Coordinates in the OSA are primarily stored as decimal degrees. The region search page does accept sexagesimal format but the results are returned as decimal degrees. Elsewhere e.g. entering constraints in the menu query or freeform SQL pages values must be entered as decimal degrees. Note: RA and Dec values are returned as decimal degrees except when written to FITS files where radians are used

. How to compute the median in SQL:
Microsoft Transact-SQL provides useful extensions to the SQL standard that enable easy computation of the median value of an attribute: one simply selects the middle value from an ordered list, generated by a nested query, of the attribute in question as in the following example which selects the median ellipticty of all detector frames:
(SELECT TOP 50 PERCENT avStellarEll FROM MultiframeDetector
WHERE avStellarEll > -0.9E9 ORDER BY avStellarEll ASC) AS T
Note the exclusion of any default values of the attribute in the query. The query can be generalised easily to produce the Nth percentile by changing the number in "TOP 50 PERCENT".

Connection problems - port 8080 (wireless network):
OSA access is carried on port 8080. Please make sure your network/firewall has this port open (this was first reported by a user trying to access the SSA from a wireless network.

Home | Overview | Browser | Access | Login | Cookbook
Listing | FreeSQL
Links | Credits

WFAU, Institute for Astronomy,
Royal Observatory, Blackford Hill
Edinburgh, EH9 3HJ, UK