#
# stub function definition file for docstring parsing
#
def sdbaseline(infile, datacolumn='data', antenna='', field='', spw='', timerange='', scan='', pol='', intent='', reindex=True, maskmode='list', thresh=5.0, avg_limit=4, minwidth=4, edge=[0, 0], blmode='fit', dosubtract=True, blformat='text', bloutput='', bltable='', blfunc='poly', order=5, npiece=2, applyfft=True, fftmethod='fft', fftthresh=3.0, addwn=[0], rejwn='', clipthresh=3.0, clipniter=0, blparam='', verbose=False, updateweight=False, sigmavalue='stddev', showprogress=False, minnrow=1000, outfile='', overwrite=False):
r"""
Fit/subtract a spectral baseline
[`Description`_] [`Examples`_] [`Development`_] [`Details`_]
Parameters
- infile_ (path) - name of input SD dataset
- datacolumn_ (string='data') - name of data column to be used ["data", "float_data", or "corrected"]
- antenna_ (string='') - select data by antenna name or ID, e.g. "PM03"
- field_ (string='') - select data by field IDs and names, e.g. "3C2*" (""=all)
- spw_ (string='') - select data by IF IDs (spectral windows), e.g. "3,5,7" (""=all)
- timerange_ (string='') - select data by time range, e.g. "09:14:0~09:54:0" (""=all) (see examples in help)
- scan_ (string='') - select data by scan numbers, e.g. "21~23" (""=all)
- pol_ (string='') - select data by polarization IDs, e.g. "XX,YY" (""=all)
- intent_ (string='') - select data by observational intent, e.g. "*ON_SOURCE*" (""=all)
- reindex_ (bool=True) - Re-index indices in subtables based on data selection
- maskmode_ (string='list') - mode of setting additional channel masks ["list" or "auto"]
.. raw:: html
<details><summary><i> maskmode = auto </i></summary>
- thresh_ (double=5.0) - S/N threshold for linefinder
- avg_limit_ (int=4) - channel averaging for broad lines
- minwidth_ (int=4) - the minimum channel width to detect as a line
- edge_ (intVec=[0, 0]) - channels to drop at beginning and end of spectrum
.. raw:: html
</details>
- blmode_ (string='fit') - baselining mode ["fit" or "apply"]
.. raw:: html
<details><summary><i> blmode = fit </i></summary>
- dosubtract_ (bool=True) - subtract baseline from input data [True, False]
- blformat_ ({string, stringVec}='text') - format(s) of file(s) in which best-fit parameters are written
- bloutput_ ({string, stringVec}='') - name(s) of file(s) in which best-fit parameters are written
.. raw:: html
</details>
.. raw:: html
<details><summary><i> blmode = apply </i></summary>
- bltable_ (string='') - name of baseline table to apply
.. raw:: html
</details>
- blfunc_ (string='poly') - baseline model function
.. raw:: html
<details><summary><i> blfunc = poly </i></summary>
- order_ (int=5) - order of baseline model function
- clipthresh_ (double=3.0) - clipping threshold for iterative fitting
- clipniter_ (int=0) - maximum iteration number for iterative fitting
.. raw:: html
</details>
.. raw:: html
<details><summary><i> blfunc = chebyshev </i></summary>
- order_ (int=5) - order of baseline model function
- clipthresh_ (double=3.0) - clipping threshold for iterative fitting
- clipniter_ (int=0) - maximum iteration number for iterative fitting
.. raw:: html
</details>
.. raw:: html
<details><summary><i> blfunc = cspline </i></summary>
- npiece_ (int=2) - number of element polynomials for cubic spline curve
- clipthresh_ (double=3.0) - clipping threshold for iterative fitting
- clipniter_ (int=0) - maximum iteration number for iterative fitting
.. raw:: html
</details>
.. raw:: html
<details><summary><i> blfunc = sinusoid </i></summary>
- applyfft_ (bool=True) - automatically set wave numbers of sinusoids
- fftmethod_ (string='fft') - method for automatically setting wave numbers of sinusoids ["fft"]
- fftthresh_ (double=3.0) - threshold to select wave numbers of sinusoids
- addwn_ (intVec=[0]) - additional wave numbers to use
- rejwn_ (intVec='') - wave numbers NOT to use
- clipthresh_ (double=3.0) - clipping threshold for iterative fitting
- clipniter_ (int=0) - maximum iteration number for iterative fitting
.. raw:: html
</details>
.. raw:: html
<details><summary><i> blfunc = variable </i></summary>
- blparam_ (string='') - text file that stores per spectrum fit parameters
- verbose_ (bool=False) - output fitting parameters to logger [True, False]
.. raw:: html
</details>
- updateweight_ (bool=False) - update WEIGHT column [True, False]
.. raw:: html
<details><summary><i> updateweight = True </i></summary>
- sigmavalue_ (string='stddev') - value used for computing weight
.. raw:: html
</details>
- showprogress_ (bool=False) - (NOT SUPPORTED YET) show progress status for large data [True, False]
.. raw:: html
<details><summary><i> showprogress = True </i></summary>
- minnrow_ (int=1000) - (NOT SUPPORTED YET) minimum number of input spectra to show progress status
.. raw:: html
</details>
- outfile_ (string='') - name of output file
- overwrite_ (bool=False) - overwrite the output file if it already exists [True, False]
.. _Description:
Description
Task **sdbaseline** fits and/or subtracts a baseline from
single-dish spectra in MS format. Given parameters that define the
baseline to be fit (function type, order of the polynomial, etc.),
**sdbaseline** computes the best-fit baseline for each spectrum
using the least-squares fitting method and, optionally, subtracts
it. The best-fit baseline parameters (including baseline type,
coefficients of basis functions, etc.) and other values such as
residual rms can be saved in various formats including ascii text
(in human-readable format or CSV format) or a baseline table (a
CASA table). Task **sdbaseline** also has a mode to 'apply' a
baseline table to MS data. For each spectrum in the MS, the
best-fit baseline is reproduced from baseline parameters stored in
the specified baseline table, and subtracted. Putting the "fit"
and "subtract" into separate processes can be useful for pipeline
processing of huge datasets.
.. rubric:: Baseline Model Functions
The user can specify the function to be used for the baseline with
the *blfunc* parameter (e.g. *blfunc = 'poly'*). In general,
polynomial fitting is stable. Sinusoid fitting is a special mode
that could be useful for data that clearly shows a standing wave
in the spectral baseline.
In addition to fitting with a single function type, users can also
specify unique baseline fitting parameters for each spectrum by
setting *blfunc='variable'*. See 'Per-spectrum Fit Parameters'
section below for details.
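As a rough illustration of what fitting with a single function type involves, the sketch below fits a Chebyshev polynomial to the line-free channels of a synthetic spectrum with NumPy and subtracts it. It is not the **sdbaseline** implementation: the helper name ``fit_chebyshev_baseline``, the mapping of channels onto [-1, 1], and the synthetic data are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def fit_chebyshev_baseline(spectrum, mask, order=5):
    # Hypothetical helper (not part of CASA): least-squares fit of a
    # Chebyshev series to the channels where mask is True.
    nchan = len(spectrum)
    # Map channel indices onto [-1, 1], the natural Chebyshev domain.
    x = np.linspace(-1.0, 1.0, nchan)
    coeffs = cheb.chebfit(x[mask], spectrum[mask], order)
    baseline = cheb.chebval(x, coeffs)
    return baseline, spectrum - baseline


# Synthetic spectrum: a quadratic baseline plus one emission line.
nchan = 2048
chan = np.arange(nchan)
x = np.linspace(-1.0, 1.0, nchan)
true_baseline = 1.0 + 0.5 * x - 0.3 * x**2
line = 2.0 * np.exp(-0.5 * ((chan - 1000) / 30.0) ** 2)
spectrum = true_baseline + line

# Fit only channels 100-800 and 1200-2000 so the line is excluded.
mask = np.zeros(nchan, dtype=bool)
mask[100:801] = True
mask[1200:2001] = True

baseline, residual = fit_chebyshev_baseline(spectrum, mask, order=5)
```

After the fit, ``residual`` holds the baseline-subtracted spectrum, in which only the emission line should remain.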
.. rubric:: Output Files
The task outputs the baseline-subtracted MS data set. Users
should specify the output data file name with the *outfile*
keyword.
Also, the fit parameters, terms, and rms of the baseline can be
saved into an ascii text file (in human-readable format or CSV
format) or a baseline table (a CASA table). By default, a text
file named <infile name> + '\_blparam.txt' is output. The
saved baseline table can be used later to subtract the baselines
from an MS.
.. rubric:: Fitting and Clipping
In general, least-squares fitting is strongly affected by extreme
data points, making the resulting fit poor. Sigma clipping is an
iterative baseline fitting method that clips data based on a
certain threshold. The threshold is set as a certain factor times
the rms of the resulting (baseline-subtracted) spectra. If sigma
clipping is on, baseline fit/removal is performed several times,
iteratively. After each baseline subtraction, data whose absolute
value is above the threshold are excluded from the next round of
fitting. By using sigma clipping, extreme data are excluded from
the fit so the resulting fit is more robust.
The user can control the rms multiplication factor with the
parameter *clipthresh*. The actual threshold for sigma clipping
is then (clipthresh) x (rms of the spectrum). The maximum number
of iterations is set with the parameter *clipniter*.
In general, sigma clipping makes the procedure slower since it
increases the number of fits per spectrum. However, it is strongly
recommended to turn on sigma clipping unless you are sure that the
data are free from any kind of extreme values that may affect the
fit.
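The iterative procedure described above can be sketched as follows. This is a minimal NumPy illustration, not the code used by **sdbaseline**: the helper name ``sigma_clip_polyfit`` is hypothetical, and it assumes that the rms is computed over the channels currently in use and that clipped channels stay excluded in later iterations.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def sigma_clip_polyfit(spectrum, order=1, clipthresh=3.0, clipniter=5):
    # Hypothetical helper (not part of CASA): polynomial fit with
    # iterative sigma clipping.
    x = np.linspace(-1.0, 1.0, len(spectrum))
    use = np.ones(len(spectrum), dtype=bool)
    coeffs = P.polyfit(x, spectrum, order)  # initial, unclipped fit
    for _ in range(clipniter):
        residual = spectrum - P.polyval(x, coeffs)
        # Assumption: rms is taken over the channels currently in use.
        rms = np.sqrt(np.mean(residual[use] ** 2))
        keep = use & (np.abs(residual) <= clipthresh * rms)
        if keep.sum() == use.sum():
            break  # nothing new was clipped, so the fit has converged
        use = keep
        coeffs = P.polyfit(x[use], spectrum[use], order)
    return coeffs, use


# Linear baseline with a small deterministic ripple and a strong spike.
nchan = 1000
chan = np.arange(nchan)
x = np.linspace(-1.0, 1.0, nchan)
spectrum = 0.5 + 0.2 * x + 0.05 * np.sin(1.7 * chan)
spectrum[500:505] += 10.0

coeffs_plain, _ = sigma_clip_polyfit(spectrum, clipniter=0)
coeffs_clip, used = sigma_clip_polyfit(spectrum, clipniter=5)
```

With ``clipniter=0`` the spike biases the fitted offset; with clipping enabled the spike channels are dropped from ``used`` and the coefficients move back toward the underlying baseline.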
.. rubric:: Update Weight
When *updateweight = True*, the WEIGHT column is updated to
:math:`1/(sigmavalue)^2`, where the *sigmavalue* parameter
("stddev" or "rms") selects the statistic: "stddev" is the
standard deviation of the baseline-subtracted spectrum and "rms"
is its root mean square. The calculation uses unflagged
channels only.
Note that the SIGMA column is not updated; it keeps the values of
the input MS data. To obtain the standard deviation of the
output MS data, compute it from the WEIGHT column values as
:math:`1/\sqrt{WEIGHT}`; the SIGMA column should not be
referred to.
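The relation between the two *sigmavalue* options and the resulting weight can be illustrated with a few lines of NumPy. This is only a sketch: the helper name ``weight_from_residual`` is hypothetical, and the use of the population standard deviation (``ddof=0``) is an assumption, not a statement about the task's internals.

```python
import numpy as np


def weight_from_residual(residual, flags, sigmavalue='stddev'):
    # Hypothetical helper (not part of CASA): weight = 1 / sigma^2,
    # computed from unflagged channels only.
    good = residual[~flags]
    if sigmavalue == 'stddev':
        # Assumption: population standard deviation (ddof=0).
        sigma = np.std(good)
    elif sigmavalue == 'rms':
        sigma = np.sqrt(np.mean(good ** 2))
    else:
        raise ValueError('sigmavalue must be "stddev" or "rms"')
    return 1.0 / sigma ** 2


# Baseline-subtracted spectrum with the last channel flagged.
residual = np.array([1.0, -1.0, 3.0, 1.0, 100.0])
flags = np.array([False, False, False, False, True])

w_std = weight_from_residual(residual, flags, 'stddev')
w_rms = weight_from_residual(residual, flags, 'rms')

# Recover the standard deviation from the weight, as the text describes.
sigma_from_weight = 1.0 / np.sqrt(w_std)
```

The two options differ whenever the baseline-subtracted spectrum has a non-zero mean, as in this toy spectrum.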
.. rubric:: Per-spectrum Fit Parameters
Per-spectrum baseline fitting parameters can be applied when
*blfunc = 'variable'*.
The fitting parameters can be defined in a text file and
specified in the *blparam* parameter. Each line of the text file
should store baseline fitting parameters for its corresponding
spectrum in the input MS. It must be a comma-separated text and
contain values in the following order:
(1) 'row': row index
(2) 'pol': polarization index in the specified row
(3) 'mask': channel range(s) used for the fitting (see examples below).
(4) 'clipniter': maximum number of times of iterative fitting (identical to the task parameter *clipniter*)
(5) 'clipthresh': clipping threshold for iterative fitting (identical to the task parameter *clipthresh*)
(6) 'use_linefinder': "true" or "false". Note that linefinder currently does not run with per-spectrum fitting even when this is set to "true", due to a bug that will be fixed in the future
(7) 'thresh': S/N threshold for linefinder (identical to the task parameter *thresh*). Blank is accepted when you don't use linefinder
(8) 'left_edge': channels to drop at beginning of spectrum (identical to the first element of the task parameter *edge*)
(9) 'right_edge': channels to drop at end of spectrum (identical to the second element of the task parameter *edge*)
(10) 'avg_limit': channel averaging for broad lines (identical to the task parameter *avg_limit*)
(11) 'blfunc': baseline model function (identical to the task parameter *blfunc*)
(12) 'order': order of polynomial function (identical to the task parameter *order*). Needed when (11) is "poly" or "chebyshev". It will be ignored when other values are set for blfunc
(13) 'npiece': number of the element polynomials of cubic spline curve (identical to the task parameter *npiece*). Needed when (11) is "cspline"
(14) 'nwave': a list of sinusoidal wave numbers. Needed when (11) is "sinusoid". The maximum wave number should not exceed the ((number of channels/2)-1) limit. If a constant offset is present in the data, include 0 in the list of wave numbers. That is, nwave=[0] is a constant term, nwave=[0,1,2] fits with a maximum of 2 sinusoids, and so on.
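To make the role of 'nwave' more concrete, here is a minimal least-squares sketch with NumPy. It assumes that wave number k means k full periods across the spectrum and that each non-zero wave number contributes one sine and one cosine term; both are assumptions made for illustration, and the helper name ``fit_sinusoid_baseline`` is hypothetical.

```python
import numpy as np


def fit_sinusoid_baseline(spectrum, nwave):
    # Hypothetical helper (not part of CASA): least-squares fit of a
    # constant term and sine/cosine pairs for the given wave numbers.
    nchan = len(spectrum)
    if max(nwave) > nchan // 2 - 1:
        raise ValueError('wave number exceeds (number of channels/2)-1')
    # Assumption: wave number k means k full periods over the spectrum.
    phase = 2.0 * np.pi * np.arange(nchan) / nchan
    columns = []
    for k in sorted(set(nwave)):
        if k == 0:
            columns.append(np.ones(nchan))  # constant offset term
        else:
            columns.append(np.sin(k * phase))
            columns.append(np.cos(k * phase))
    design = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(design, spectrum, rcond=None)
    return design @ coeffs


# Synthetic standing wave: an offset plus components at k=1 and k=3.
nchan = 256
phase = 2.0 * np.pi * np.arange(nchan) / nchan
spectrum = 1.0 + 0.5 * np.sin(3 * phase) + 0.25 * np.cos(phase)

baseline = fit_sinusoid_baseline(spectrum, nwave=[0, 1, 2, 3])
residual = spectrum - baseline
```

Leaving 0 out of ``nwave`` in this example would leave the constant offset in the residual, which is why the offset term has to be requested explicitly.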
Note that the following task parameters will be ignored/overwritten
when *blfunc = 'variable'* is specified (i.e., when per-spectrum
fitting is executed):
- for iterative clipping: *clipniter*, *clipthresh*
- for linefinder: *thresh*, *edge*, *avg_limit*
- for baseline model function: *blfunc*, *order*, *npiece*, *applyfft*, *fftmethod*, *fftthresh*, *addwn*, *rejwn*
Note also that:
(1) lines starting with '#' will be ignored and can be used as
comments
(2) for MS spectra which have no corresponding line in the text
file, baseline fitting is not executed
Examples of text file:
(1) a simple one:
::
0,0,,2,3,false,,,,,poly,5,,[]
0,1,1500~7500,0,3.,false,0.,0,0,0,chebyshev,10,0,[]
1,0,,4,2.5,true,5.,70,80,3,cspline,,6,[]
1,1,0~4000;6000~8000,0,,false,,,,,sinusoid,,,[0,1,2,3,4,5,6,7]
#2,0,,0,,false,,,,,poly,10,,[]
(2) same setting as (1), but with detailed comments:
::
# for row 0, pol 0: no channel mask,
# iterative (twice at maximum) clipping at 3 sigma,
# no linefinder,
# fitting with polynomial of order 5
0,0,,2,3,false,,,,,poly,5,,[]
# for row 0, pol 1: use channel range 1500 to 7500,
# no iterative clipping (clipniter=0),
# no linefinder,
# fitting with Chebyshev polynomial of order 10
0,1,1500~7500,0,3.,false,0.,0,0,0,chebyshev,10,0,[]
# for row 1, pol 0: no channel mask,
# iterative (4 times at maximum) clipping at 2.5 sigma,
# using linefinder (thresh: 5.0 sigma,
# left_edge: 70 channels,
# right_edge: 80 channels,
# avg_limit: 3),
# fitting with cubic spline with 6 elements
1,0,,4,2.5,true,5.,70,80,3,cspline,,6,[]
# for row 1, pol 1: use channel ranges (0 to 4000) and (6000 to 8000),
# no iterative clipping,
# no linefinder,
# fitting with sinusoids with wave numbers up to 7
1,1,0~4000;6000~8000,0,,false,,,,,sinusoid,,,[0,1,2,3,4,5,6,7]
# for row 2, pol 0: no baseline fitting as the line is commented out
#2,0,,0,,false,,,,,poly,10,,[]
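Because the last field ('nwave') itself contains commas, a line of this file cannot be split naively on every comma. The sketch below shows one way to read such a line into a dictionary keyed by the field names listed above. It is an illustrative parser written for this document, not the one used by **sdbaseline**; the function name ``parse_blparam_line`` and the choice of ``None`` for blank values are assumptions.

```python
# Field names for the first 13 comma-separated values; 'nwave' follows.
FIELDS = ('row', 'pol', 'mask', 'clipniter', 'clipthresh',
          'use_linefinder', 'thresh', 'left_edge', 'right_edge',
          'avg_limit', 'blfunc', 'order', 'npiece')
FLOAT_FIELDS = {'clipthresh', 'thresh'}


def parse_blparam_line(line):
    # Hypothetical parser (not part of CASA) for one blparam line.
    line = line.strip()
    if not line or line.startswith('#'):
        return None  # comment or blank line: no fit for this spectrum
    # Split off the bracketed nwave list, which contains commas itself.
    head, bracket, tail = line.partition('[')
    if not bracket or not head.endswith(',') or not tail.endswith(']'):
        raise ValueError('malformed line: ' + line)
    values = [v.strip() for v in head[:-1].split(',')]
    if len(values) != len(FIELDS):
        raise ValueError('expected 14 fields: ' + line)
    record = {}
    for name, value in zip(FIELDS, values):
        if name == 'mask':
            # '0~4000;6000~8000' -> [(0, 4000), (6000, 8000)]
            record[name] = [tuple(int(c) for c in part.split('~'))
                            for part in value.split(';')] if value else []
        elif name == 'use_linefinder':
            record[name] = value.lower() == 'true'
        elif name == 'blfunc':
            record[name] = value
        elif value == '':
            record[name] = None  # blank field
        elif name in FLOAT_FIELDS:
            record[name] = float(value)
        else:
            record[name] = int(value)
    record['nwave'] = [int(v) for v in tail[:-1].split(',') if v.strip()]
    return record


rec_spline = parse_blparam_line('1,0,,4,2.5,true,5.,70,80,3,cspline,,6,[]')
rec_sinus = parse_blparam_line(
    '1,1,0~4000;6000~8000,0,,false,,,,,sinusoid,,,[0,1,2,3,4,5,6,7]')
rec_comment = parse_blparam_line('#2,0,,0,,false,,,,,poly,10,,[]')
```

Here ``rec_comment`` is ``None`` because the line is commented out, matching the rule that such spectra are not fitted.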
.. _Examples:
Examples
.. rubric:: Example 1
This is one of the simplest examples. It fits and removes a
Chebyshev polynomial (5th order by default) from the
data 'sd_data.ms', using only spectral window 0 and fitting
channels 100-800 and 1200-2000 (to avoid, for example, bandpass
roll-off at the edges and an emission line that might
occur over channels 800-1200).
::
sdbaseline(infile='sd_data.ms', spw='0:100~800;1200~2000', blfunc='chebyshev',
outfile='sd_data.ms.bl', overwrite=True)
.. rubric:: Example 2
This example shows fitting and subtracting a sinusoidal baseline.
It fits and removes a sinusoid from the data 'sd_data.ms', using
spectral window 0 and scan number 0. Wave numbers of sinusoids are
set automatically with the fft method.
::
sdbaseline(infile='sd_data.ms', spw='0', scan='0', blfunc='sinusoid', applyfft=True,
fftmethod='fft', outfile='sd_data.ms.bl', overwrite=True)
.. rubric:: Example 3
In this example, the user specifies different fitting parameters
per spectrum, using blfunc='variable' and specifying the fit
parameters using a text file.
::
sdbaseline(infile='sd_data.ms', blfunc='variable', blparam='blparam.txt',
outfile='sd_data.ms.bl', overwrite=True)
Here is the text file "blparam.txt" used in the above example.
::
#row,pol,mask,clipniter,clipthresh,use_linefinder,thresh,Ledge,Redge,avg_limit,blfunc,order,npiece,nwave
0,0,100~750;1250~1900,0,3.,false,0.,0,0,0,chebyshev,2,0,[]
0,1,,0,3.,false,0.,0,0,0,chebyshev,0,0,[]
1,0,0~500;1500~2000,0,3.,false,0.,0,0,0,poly,1,0,[]
Here is the text file "blparam.txt" used as a sinusoid fitting example.
::
#row,pol,mask,clipniter,clipthresh,use_linefinder,thresh,Ledge,Redge,avg_limit,blfunc,order,npiece,nwave
0,0,100~750;1250~1900,0,3.,false,0.,0,0,0,sinusoid,,,[0,1,3,5,12]
0,1,,0,3.,false,0.,0,0,0,sinusoid,,,[1,5,6]
1,0,0~500;1500~2000,0,3.,false,0.,0,0,0,sinusoid,,,[0,1,2,3,4,5,7,12]
.. rubric:: Example 4
This is an example of fitting and subtracting a polynomial
baseline, and also updating the WEIGHT column of the output MS
'sd_data.ms.bl' as :math:`1/RMS^2` .
::
sdbaseline(infile='sd_data.ms', blfunc='poly', updateweight=True, sigmavalue='rms',
outfile='sd_data.ms.bl', overwrite=True)
.. rubric:: Example 5
This example shows a polynomial baseline fitting, but without subtraction;
instead, the fitting results are saved as a text file 'sd_data_blparam.txt'
and a baseline table 'sd_data_blparam.bltable', which can be used for
actual baseline subtraction afterwards (see also Example 6).
::
sdbaseline(infile='sd_data.ms', blfunc='poly', dosubtract=False, blformat=['text','table'])
.. rubric:: Example 6
This example shows applying a baseline table to an MS to actually subtract
the best-fit baseline.
::
sdbaseline(infile='sd_data.ms', blmode='apply', bltable='sd_data_blparam.bltable',
outfile='sd_data.ms.bl')
.. _Development:
Development
No additional development details
.. _Details:
Parameter Details
Detailed descriptions of each function parameter
.. _infile:
| ``infile (path)`` - name of input SD dataset
.. _datacolumn:
| ``datacolumn (string='data')`` - name of data column to be used ["data", "float_data", or "corrected"]
.. _antenna:
| ``antenna (string='')`` - select data by antenna name or ID, e.g. "PM03"
.. _field:
| ``field (string='')`` - select data by field IDs and names, e.g. "3C2*" (""=all)
.. _spw:
| ``spw (string='')`` - select data by IF IDs (spectral windows), e.g. "3,5,7" (""=all)
.. _timerange:
| ``timerange (string='')`` - select data by time range, e.g. "09:14:0~09:54:0" (""=all) (see examples in help)
.. _scan:
| ``scan (string='')`` - select data by scan numbers, e.g. "21~23" (""=all)
.. _pol:
| ``pol (string='')`` - select data by polarization IDs, e.g. "XX,YY" (""=all)
.. _intent:
| ``intent (string='')`` - select data by observational intent, e.g. "*ON_SOURCE*" (""=all)
.. _reindex:
| ``reindex (bool=True)`` - Re-index indices in subtables based on data selection. Ignored when blmode='apply'.
.. _maskmode:
| ``maskmode (string='list')`` - mode of setting additional channel masks. "list" and "auto" are available now.
.. _thresh:
| ``thresh (double=5.0)`` - S/N threshold for linefinder
.. _avg_limit:
| ``avg_limit (int=4)`` - channel averaging for broad lines
.. _minwidth:
| ``minwidth (int=4)`` - the minimum channel width to detect as a line
.. _edge:
| ``edge (intVec=[0, 0])`` - channels to drop at beginning and end of spectrum
.. _blmode:
| ``blmode (string='fit')`` - baselining mode ["fit" or "apply"]
.. _dosubtract:
| ``dosubtract (bool=True)`` - subtract baseline from input data [True, False]
.. _blformat:
| ``blformat ({string, stringVec}='text')`` - format(s) of file(s) in which best-fit parameters are written ["text", "csv", "table" or ""]
.. _bloutput:
| ``bloutput ({string, stringVec}='')`` - name(s) of file(s) in which best-fit parameters are written
.. _bltable:
| ``bltable (string='')`` - name of baseline table to apply
.. _blfunc:
| ``blfunc (string='poly')`` - baseline model function ["poly", "chebyshev", "cspline", "sinusoid", or "variable"(expert mode)]
.. _order:
| ``order (int=5)`` - order of baseline model function
.. _npiece:
| ``npiece (int=2)`` - number of element polynomials for cubic spline curve
.. _applyfft:
| ``applyfft (bool=True)`` - automatically set wave numbers of sinusoids
.. _fftmethod:
| ``fftmethod (string='fft')`` - method for automatically setting wave numbers of sinusoids
.. _fftthresh:
| ``fftthresh (double=3.0)`` - threshold to select wave numbers of sinusoids
.. _addwn:
| ``addwn (intVec=[0])`` - additional wave numbers to use
.. _rejwn:
| ``rejwn (intVec='')`` - wave numbers NOT to use
.. _clipthresh:
| ``clipthresh (double=3.0)`` - clipping threshold for iterative fitting
.. _clipniter:
| ``clipniter (int=0)`` - maximum iteration number for iterative fitting
.. _blparam:
| ``blparam (string='')`` - text file that stores per spectrum fit parameters
.. _verbose:
| ``verbose (bool=False)`` - output fitting parameters to logger
.. _updateweight:
| ``updateweight (bool=False)`` - update WEIGHT column based on sigmavalue computed over unmasked range
.. _sigmavalue:
| ``sigmavalue (string='stddev')`` - value used for computing weight ["stddev" or "rms"]
.. _showprogress:
| ``showprogress (bool=False)`` - (NOT SUPPORTED YET) show progress status for large data
.. _minnrow:
| ``minnrow (int=1000)`` - (NOT SUPPORTED YET) minimum number of input spectra to show progress status
.. _outfile:
| ``outfile (string='')`` - name of output file
.. _overwrite:
| ``overwrite (bool=False)`` - overwrite the output file if it already exists
"""
pass