|
OSA Questions and Answers
Before contacting osa-support@roe.ac.uk if you have a
query about accessing the OSA please see
if it has already been covered in the Q&A listed below. (note: please do not contact
individuals in the Wide-Field Astronomy Unit, use osa-support@roe.ac.uk).
General look-ups between tables *Source and *Detection.
In general when querying the *Source table, you may find that certain catalogue quantities
that you need are not present. This is because the table design in the OSA is to
write only a subset of the most useful attributes available from *Detection into *Source (for
efficiency reasons). For example, CASU standard source extraction produces 13 aperture
magnitudes for every detection in every passband and these are of course present in
*Detection, but in *Source you will only see a subset of 3 per passband - aperMags 1, 3
and 4 for example. How do you query if you are interested in an aperture magnitude
other than 1, 3 or 4 - say aperMag2 in the G band for an ATLAS query?
The answer is to do a join query between *Source and *Detection,
noting that each row of *Source is a member of a frame set the details of
which are stored in *MergeLog (hence the join has to include *MergeLog as well).
The SQL syntax is best illustrated by an example:
SELECT top 10 s.ra, s.dec, gd.aperMag2 as gAperMag2, s.gAperMag3
FROM atlasSource AS s, atlasMergeLog AS l, atlasDetection as gd
WHERE
/* Join source to merge log for the frame set: */
s.frameSetID=l.frameSetID AND
/* Join merge log to detection for the J frame: */
l.gmfID = gd.multiframeID AND l.geNum = gd.extNum AND
/* Join source to detection to look up the detection required: */
s.gSeqNum = gd.seqNum and gd.seqnum >0
Note that the final predicate could be done via objID:
... AND s.jobjID = jd.objID but this is most inadvisable for
efficiency reasons (the attribute combination multiframe number,
extension number, sequence number is the primary key in
*Detection so look-ups are very fast on that combination) and
furthermore for the more subtle reasons detailed
below.
Default detection rows and objID or (multiframeID,extNum,seqNum) for look-ups between *Source and *Detection.
In the example *Source-to-*Detection join query given
above,
one might be tempted to use objID as the look-up into *Detection in place
of the more long-winded predicate combination involving multiframeID,
extNum and seqNum since both the former attribute and the latter
3-attribute combination are unique. Users should be aware, however, that there
are default rows in the *Detection table to allow joins between *Source
and *Detection that allow selection of *Source rows that have incomplete
passband coverage when querying that table in conjunction with a join with
*Detection. Since there is a default row in *Detection
for every (non-default) multiframe/extension number combination, looking up a row in
*Detection via objID alone when that attribute is a default produces many matches
for all those default rows in *Detection having that same default value in that table.
Those rows could be excluded by an addition WHERE ... AND objID > 0 but this is
very inefficient and will be inappropriate if you are trying to select source rows that
may have non-detections (and therefore default objIDs) in some passbands. Hence,
ALWAYS use the correct relational association of the combination
of multiframeID, extNum and seqNum when joining *Source
to *Detection, as in the example query given above.
Trade-off between completeness and reliability in point source sample selections
There is always a trade-off between completeness and reliability when using
catalogue data, since no image analysis software is perfect. The
morphological classification attributes class and classStat
allow you to make a choice about how to maximise completeness, or
conversely minimise contamination, at query time - for examples of this,
consult the SQL scripts defined under "views" in the schema browser. Note
that although the integer classification scheme is based on the N(0,1) measure
of stellarness (classStat) it has several overrides built in to attempt to
make it more reliable. ClassStat is equivalent to a straight likelihood
measure based directly on the curve-of-growth of the aperture fluxes.
But because the statistic is not strictly speaking a Gaussian distribution
and has much broader tails, the Class parameter has assorted overrides
invoked based on saturation, ellipticity as a function of magnitude and so
on. The most reliable way of getting a reasonably complete stellar-like sample
is to require class of -1 and -2 on the bands in question and then if you
are concerned about completeness add in any extra sources with ClassStat
in the range 0 to +n-sigma; where n is between 3 and 5 depending on how
non-stellar you want to push things.
Compressed image format (CFITSIO RICE compression)
CASU pipeline processed images are supplied in a compact, compressed format
to reduce online storage and network transfer overheads. The format employed
is known as
Rice
tile compression, and is implemented in the CFITSIO library
that is used by the core processing software. By default, images are
delivered from the WSA in this format: network transfer will be a
factor 3 to 4 times quicker, and end-user storage requirements a factor
3 to 4 times smaller, if you download the compressed image files (the
compression ratio depends on the noise properties of the image in question;
these figures are typical for short WFCAM exposures, as in the UKIDSS
Large Area Survey). Hence
it makes a great deal of sense to stick to using Rice compressed
products as opposed to selecting standard decompressed images (which are
also available). The only disadvantage is that the Rice compressd format
is not universally supported in applications software at present; note however that it is easy
to download and install
an image copy utility (imcopy)
that can decompress to standard FITS (the Rice
compressed images are actually stored as FITS binary tables and not
standard FITS images - this is why some image display utilities will not
handle them correctly). On a system with Starlink you can compile imcopy using
gcc -I/star/include -L/star/lib -lm imcopy.c -lcfitsio -o imcopy
The OSA imcopy page gives a step by step guide to
downloading and installation.
CFITSIO Quick Start Guide includes additional informational on
installation.
When trying to view or uncompress our FITS files, in some rare cases you
may come across one or both of the following error messages:
FITSIO status = 1: non-CFITSIO program error
decompression error: hit end of compressed byte stream
This is because our FITS file images have been compressed using a newer
version of fpack & CFITSIO. The only solution is to first uncompress the
FITS file using a version of CFITSIO >= 3.1 (fpack >= 1.0) prior to
viewing.
Querying date/time quantities
All dates and time stamps used in the data flow system (i.e. from the
telescope/instrument, through the pipeline processing and at the archive
end) are universal time co-ordinate (UTC) quantities; the
SQL data type for these quantities in the archive is the MS SQL Server
datetime type. To query a datetime type, e.g. compare a datetime
quantity against a literal constant, simply express that constant as
a string in double quotes in the following format:
SELECT count(*)
FROM Multiframe
WHERE dateObs >'2005-Apr-18 14:00:00.0'
will return a count of the number of frames in the archive that have
observation dates and times since 14:00:00.0 18th April 2005 (UTC)
Querying string quantities
String quantities (SQL data types char or varchar) can be
queried in the WHERE clause of an SQL statement and compared using
certain comparators (see the Cookbook) in the same way as numerical
quantities. Perhaps the most useful comparator, however, is LIKE.
For example, you can select all archived FITS file observations from a
given date string in the filename (say 1st April 2005) by the following
statement input into the freeform SQL box:
SELECT fileName
FROM Multiframe
WHERE fileName LIKE '%20050401%'
Note the use of double quotes for string literals, and the use of the
wildcard percent character to match any string.
Querying cross-matched catalogue data (e.g. ATLAS v. SDSS)
OSA catalogue tables (i.e. those created for the VST surveys) are
automatically cross-matched to a number of external survey
catalogue datasets held locally, e.g. SDSS Data Releases; FIRST, IRAS, ROSAT
and 2MASS catalogues; and legacy photographic catalogues like USNO-B and
the SuperCOSMOS Sky Surveys held within the
SuperCOSMOS Science Archive (for
a complete list, click on "Browser" on a navigation bar).
Rather than prejoining any two datasets to create a static, merged table of
what are assumed to be associated sources on the basis of a fixed joining
algorithm, the OSA philosophy is to create a table of pointers between
any two matched datasets. This means that any externally catalogued source
co-incident or nearby a VST source is readily available, out to some
predefined maximum angular search radius, and all sorts of science usages
are possible because the exact matching criteria (e.g. maximum allowable
displacement, consistency in morphological parameters or classification,
or even the number of possible external matches for a given VST source)
can be tuned at query time by simple expressions in SQL. Furthermore, all
attributes of both datasets are available to querying, because no decision
has been made as to which attributes to propagate into a static merged set.
The flip side to this flexibility is the rather strange overall syntax for
querying cross-matched data in SQL: instead of querying a single merged
table, e.g. SELECT * FROM MergedTable WHERE ...,
in general you have to query three tables: the two cross-matched
tables and the table of pointers, e.g.
SELECT *
FROM Table1, ExternalDB..Table2, CrossNeighboursTable2
WHERE ...
A complete tutorial, and example queries employing cross-matched data,
are available in Section 4 of the OSA Cookbook.
Coordinates - decimal degrees, sexagesimal, radians:
Coordinates in the OSA are primarily stored as decimal degrees. The region
search page does accept sexagesimal format but the results are returned
as decimal degrees. Elsewhere e.g. entering constraints in the menu query or freeform SQL pages values must be entered as decimal degrees.
Note: RA and Dec values are returned as decimal degrees except when written
to FITS files where radians are used .
How to compute the median in SQL:
Microsoft Transact-SQL provides useful extensions to the SQL standard that
enable easy computation of the median value of an attribute: one simply
selects the middle value from an ordered list, generated by a nested
query, of the attribute in question as in the following example which
selects the median ellipticty of all detector frames:
SELECT MAX(T.avStellarEll) FROM
(SELECT TOP 50 PERCENT avStellarEll FROM MultiframeDetector
WHERE avStellarEll > -0.9E9 ORDER BY avStellarEll ASC) AS T
Note the exclusion of any default values of the attribute in the query. The
query can be generalised easily to produce the Nth percentile by
changing the number in "TOP 50 PERCENT".
Connection problems - port 8080 (wireless network):
OSA access is carried on port 8080. Please make
sure your network/firewall has this port open (this was first reported
by a user trying to access the SSA from a wireless network.
Home |
Overview |
Browser |
Access |
Login |
Cookbook
Listing |
FreeSQL
Links |
Credits
WFAU, Institute for Astronomy,
Royal Observatory, Blackford Hill
Edinburgh, EH9 3HJ, UK
osa-support@roe.ac.uk
23/4/2015
|