Biometrics on the Lake
The International Biometric Society Australasian Region Conference
29 November - 3 December 2009, Taupo, New Zealand

Organisers

The conference is being organised by the following teams of people.

LOCAL ORGANISING COMMITTEE (LOC)

Neil Cox - neilcox@agresearch.co.nz
Melissa Dobbie - Melissa.Dobbie@csiro.au
Harold Henderson - haroldhenderson@agresearch.co.nz
Hans Hockey - hans@biometricsmatters.com
Kathy Ruggiero (Chair) - kruggiero@auckland.ac.nz

SCIENTIFIC PROGRAM COMMITTEE (SPC)

David Baird - david@vsn.co.nz
James Curran (Chair) - jcurran@auckland.ac.nz
Graham Hepworth - hepworth@unimelb.edu.au
Hans Hockey - hans@biometricsmatters.com


WELCOME TO BIOMETRICS ON THE LAKE

From the President of the region

I would like to give you a warm welcome to Biometrics on the Lake, the 2009 conference of the Australasian Region of the International Biometric Society. On behalf of the Local Organising Committee and the Scientific Program Committee, I can say that it is great to have you here in Taupo to participate in this meeting, which is sure to be another excellent event following in the footsteps of its predecessor at Coffs Harbour two years ago. We are sure that you will enjoy both the scientific and social aspects of the conference. With high-quality keynote and invited speakers, a great range of contributed talks and posters, and the chance to mix with colleagues, it will no doubt prove to be very beneficial professionally. And you can look forward to an enjoyable social program in this superb part of the world.

The organisers are delighted with the number of delegates at the conference - about 120 at the time of writing - in addition to accompanying persons, whom we are very pleased to welcome as well. Thirteen different countries are represented, and we thank those from outside the region who have travelled large distances to get here. Not surprisingly, New Zealand has the largest number (about half of all delegates), but that wasn't a hard one to win.

These four days together provide a wonderful opportunity to renew friendships and professional contacts and forge new ones. We look forward to catching up with many of you personally.

Graham Hepworth
President, Australasian Region
International Biometric Society


CONTENTS

Conference at a Glance 2

Keynote Speakers 4

Invited Speakers 6

General Information 7

Venue Information amp Map 8

Organised Social Activities 9

Sponsors 14

Venue Floor Plan 16

Conference Timetable 17

Oral Presentation Abstracts 25

Poster Presentation Abstracts 89

Index of Presenting Authors 102

Delegates List 107


CONFERENCE AT A GLANCE

Welcome Reception - Sunday 29 Nov

A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.

Keynote addresses

Louise Ryan (Monday 30 Nov), Martin Bland (Tuesday 1 Dec), Thomas Lumley (Wednesday 2 Dec), Chris Triggs (Thursday 3 Dec)

All keynote addresses begin at 9 am and will be held in Swifts (see map of venue on page 16).

Invited Speakers

Ross Ihaka (1330, Monday 30 Nov), Alison Smith (1330, Wednesday 2 Dec), Kaye Basford (1100, Thursday 3 Dec)

All invited speaker talks will be held in Swifts (see map of venue on page 16).

Organised Social Activities - Tuesday 1 Dec

This is a long-standing part of the conference program, so, keeping with tradition, we have arranged four options for the afternoon of Tuesday 1 December, after lunch. We hope that you will find at least one of these activities attractive, because we want you to relax, get a breath of fresh air (or sulphur fumes at Orakei Korako), have fun and see some of what this part of New Zealand has to offer, especially for first-time visitors. These activities are optional, and tickets need to be purchased for them through the conference organisers. Preferences will be considered on a first come, first served basis. If you have queries about any of the social activities, please contact Hans Hockey in the first instance. An afternoon snack will be provided on the cruise and kayaking trips, while the bus tour visits a cafe and the jet boating trip is too action-packed for eating.

If you have not already registered for one of these activities, please talk to someone at the registration desk to make arrangements before Tuesday morning. Please note that all costs for the activities are in New Zealand dollars.


Conference dinner - Wednesday 2 Dec

To add some novelty to the conference program, the conference dinner will be held at the Prawn Park restaurant, home of Shawn the Prawn. At the Prawn Park, just 10 minutes' drive north of Taupo on the Waikato River (see map on page 8), you will be able to compete at prawn fishing or the Killer Prawn Hole-in-One Golf Challenge, take a guided tour of the nursery and hatchery, enjoy fun and interactive water features and a glass of bubbly in the geothermal footbath, as well as a sumptuous meal with breathtaking views at the riverside restaurant (on the patio, weather permitting). Drinks (wine and non-alcoholic) will be provided, and all dietary requirements can be catered for.

Coaches have been arranged to transfer delegates to Huka Prawn Farm from the Suncourt Hotel, leaving at 6 pm, with return trips at the conclusion of the event.

Conference Prizes - Donated by CSIRO Mathematical and Information Sciences

Prizes will be awarded for the best oral presentation and the best poster presentation by a young statistician, as judged by a panel. To be eligible for these awards, the presenter must be a member of the IBS Australasian Region and be either a student (full-time or part-time), a person who has graduated with a Bachelor's Degree (in a biometrics-related field) within the last five years, or a person awarded a Postgraduate Degree within the past year.


KEYNOTE SPEAKERS

Martin Bland, University of York

Martin Bland joined the University of York as Professor of Health Statistics in 2003. Before this he spent 27 years at St George's Hospital Medical School, University of London, following posts at St Thomas's Hospital Medical School and in industry with ICI. He has a BSc in Mathematics, an MSc in Statistics and a PhD in Epidemiology. He is the author or co-author of An Introduction to Medical Statistics (now in its third edition) and Statistical Questions in Evidence-based Medicine (both Oxford University Press), more than 190 refereed journal articles reporting public health and clinical research and research methods, and, with Prof Doug Altman, the Statistics Notes series in the British Medical Journal. He is currently working on clinical trials in wound care, hazardous alcohol use, depression, irritable bowel syndrome and stroke prevention. His personal research interests are in the design and analysis of studies of clinical measurement and of cluster randomised clinical trials. His 1986 Lancet paper with Doug Altman on statistical methods for assessing agreement between two methods of clinical measurement has been cited more than 13,000 times; it is the most cited paper ever to appear in the Lancet and has been reported to be the sixth most highly cited statistical paper ever.

Martin presented a two-day satellite course on Cluster Randomised Trials in Auckland on 25-26 November.

Thomas Lumley, University of Washington

Thomas Lumley is an Associate Professor in the Biostatistics Department at the University of Washington in Seattle. Thomas has accrued an impressive body of work and awards in a comparatively short amount of time. Since completing his PhD in 1998, he has published well over 100 peer-reviewed articles in the leading journals of statistics, biostatistics and the health sciences, covering theory, methodology and application. In addition, he has given a substantial number of talks and workshops around the world. In 2008 Thomas was awarded the Gertrude Cox Award for contributions to statistical practice. Thomas is also a member of the R Core development team, and his expertise in the field of statistical computing is recognised worldwide.


Louise Ryan, CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health, Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics, Informatics and Statistics (CMIS). Dr Ryan has a distinguished career in biostatistics, having authored or co-authored over 200 papers in peer-reviewed journals. Louise is a fellow of the American Statistical Association and the International Statistical Institute, and is an elected member of the Institute of Medicine. She has served in a variety of professional capacities, including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society. She has served on advisory boards for several government agencies in the USA, including the National Toxicology Program and the Environmental Protection Agency, as well as several committees for the National Academy of Sciences. She retains an adjunct professorship at Harvard.

Chris Triggs, University of Auckland

Chris Triggs is a Professor and the current Head of the Department of Statistics at the University of Auckland, New Zealand. He has been a respected statistician for 30 years, specialising in fields as diverse as experimental design and forensic science. Professor Triggs has published more than 90 papers in a wide variety of statistical fields. His research interests include experimental design, population genetics and the application of statistical methods in many fields of science, including forensic science and nutrigenomics. He has lectured extensively in many of these subjects in Australasia. Professor Triggs is an Associate Editor for Biometrics and is often called upon as a referee for many other journals.


INVITED SPEAKERS

Ross Ihaka, University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland. He is recognised as one of the originators of the R programming language. In 2008 he received the Royal Society of New Zealand's Pickering Medal for his work on R.

Kaye Basford, University of Queensland

Kaye Basford is Head of the School of Land, Crop and Food Sciences at the University of Queensland. Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments, in particular using a pattern analysis approach. Kaye is currently IBS Vice-President, in advance of her Presidential term in 2010-11.

Alison Smith, NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the Biometrics Unit of the NSW Department of Industry and Investment (formerly Primary Industries), where she works on and researches methodology for plant breeding multi-environment variety trials, plant quality trait experiments, micro-array data and outlier detection in linear mixed models.


GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events.

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions.

Conference Catering
Lunches, morning and afternoon teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16).

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk.

Transport has been arranged in coaches to transfer delegates to the dinner from the Suncourt Hotel & Conference Centre, leaving at 6 pm, with return trips at the conclusion of the event.

Welcome Reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.


VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel
The Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (0.2 km).
Turn left at Titiraupenga Street (31 m).
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street, then onto Tongariro Street).
Continue onto Tongariro Street (1.1 km - go through one roundabout).
Continue onto SH 1 / SH 5 (1.0 km).
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road. (Note that Huka Falls Road becomes Karetoto Road.)
Take the sign-posted right just past Helistar and continue straight past the Honey Hive to the end of Karetoto Road.

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo, (07) 378 8265, www.suncourt.co.nz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo


ORGANISED SOCIAL ACTIVITIES

Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind, the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region.

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member, non-member or student) attending the whole week. Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events.

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere, so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6 pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim and then a meal at the Terraces Hotel (80-100 Napier-Taupo Highway, Taupo, Tel (07) 378-7080).

Please let Sammie Jia (yilinjia@plantandfood.co.nz) know if you are interested in attending this event.

Other Organised Social Activities - Tuesday 1 Dec

1. Cruise on Lake Taupo, with optional extras

With Chris Jolly Outdoors, smell the coffee brewing as you board the Waikare II, take a seat and enjoy all the comforts as you cruise to the Maori rock carvings. A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina. Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park. The sights are amazing all year round. Afternoon tea is included as part of your charter, and tea or coffee are complimentary throughout the cruise. There are also full bar facilities.

Fishing for, and hopefully eating, rainbow or brown trout is included in the charter, although to meet licence requirements only four clients can be nominated to actually land the catch. Only 4 lines can be put out at a time on downriggers. If successful, any catch can be barbequed or sashimied and served and shared onboard - there is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation, where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting; the cost is $1.80 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth.
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front.
Take: Swimwear, including a towel, if you want an invigorating deep-water swim off the launch. Don't forget to take your camera, as some of the scenery can only be seen from on the water.
Cost: $70 per person, based on a three-hour scenic charter including fishing, with clay bird shooting extra at $1.80 per shot.
Notes: For this activity to proceed at this cost we require a minimum of 10 persons.

2. Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River, and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up the river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.

Leaving the gushing sounds of the mesmerising Falls, you cut through leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park, the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick-up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak in the thermal streams on the kayak down or on the walk back.

Time: Pickup from the Suncourt Hotel at 1.30 pm, returning around 6.00 pm.
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (a waterproof case may be handy).
Cost: $50 per person.
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options.

3. Jet boating, geothermal and nature - Orakei Korako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the RiverJet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Korako, possibly the best thermal area in New Zealand.

In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's Cave and some of the largest silica terraces in the world.

While the park visitors are on land, option 2 ($140) whisks swimmers away to the Squeeze. You will disembark the boat in knee-deep warm water. After manoeuvring your way through narrow crevasses, climbing boulders and wading through waist-deep warm water, you emerge in stunning native New Zealand bush. Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool.

Then the groups rejoin for the thrilling return trip, giving a total trip time of about three hours. This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience.

Time: Transport departs the Suncourt Hotel at 1.30 pm and returns at approximately 5.30 pm.
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera, as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1.
Cost: $155 per person for option 1, including park admission; $140 per person for option 2; both options include transport.
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used.

4. Art, Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to the Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop-off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs the Suncourt Hotel at 1.30 pm promptly and terminates at the nearby Scenic Cellars at approximately 5.30 pm.
Take: A cafe snack is not included, but all entry fees are. Don't forget to take your camera.
Cost: $70 per person.
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22.


SPONSORS

The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland


AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax


VENUE FLOOR PLAN

1. Boardroom - for all Boardroom session presentations
2. Swifts - for keynote addresses, invited speaker talks and all Swifts sessions
3. Bathrooms/Toilets
4. 'Chill on Northcroft' Restaurant - all morning/afternoon teas and lunches will be provided here
5. Gullivers - computer room with two internet access desktops
6. Lems - registration desk location, and further desk space and power points for wireless internet access

(Floor plan diagram: locations numbered 1-6 as listed above.)


CONFERENCE TIMETABLE

SUNDAY 29TH NOV

1600  Conference Registration opens
1800  Welcome Reception
      Dinner (own arrangement)

MONDAY 30TH NOV

850   Presidential Opening (Swifts) - Graham Hepworth, University of Melbourne

900   Keynote Address (Swifts) - Louise Ryan, CSIRO Mathematics Informatics and Statistics: Quantifying uncertainty in risk assessment. Chair: Graham Hepworth

950-1030   Session 1. Swifts: Medical (Chair: John Field). Boardroom: Ecological Modelling (Chair: Teresa Neeman)

  950   Swifts: Sub-site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approach - Mohamad Asghari, Tarbiat Modares University
        Boardroom: Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splines - Charis Burridge, CSIRO Mathematics Informatics and Statistics

  1010  Swifts: Personalised medicine: endovascular aneurysm repair risk assessment model using preoperative variables - Mary Barnes, CSIRO Mathematics Informatics and Statistics
        Boardroom: Rank regression for analyzing environmental data - You-Gan Wang, CSIRO Mathematics Informatics and Statistics

1030  Morning Tea (30 minutes)

1100-1220  Session 2. Swifts: Modelling (Chair: Andrew McLachlan). Boardroom: Environmental & Methods (Chair: Zaneta Park)

  1100  Swifts: Introduction to quantile regression - David Baird, VSN NZ Ltd
        Boardroom: Capture recapture estimation using finite mixtures of arbitrary dimension - Richard Arnold, Victoria University

  1120  Swifts: Incorporating study characteristics in the modelling of associations across studies - Elizabeth Stojanovski, University of Newcastle
        Boardroom: The effect of a GnRH vaccine, GonaCon, on the growth of juvenile tammar wallabies - Robert Forrester, ANU

  1140  Swifts: A comparison of matrices of time series with application in dendroclimatology - Maryann Pirie, University of Auckland
        Boardroom: Model based grouping of species across environmental gradients - Ross Darnell, CSIRO Mathematics Informatics and Statistics

  1200  Swifts: How SAS and R integrate - Michael Graham, SAS Auckland
        Boardroom: The use of the chi-square test when observations are dependent - Austina Clark, University of Otago

1220  Lunch (1 hour 10 minutes)

1330  Invited Speaker (Swifts) - Ross Ihaka, University of Auckland: Writing Efficient Programs in R and Beyond. Chair: Renate Meyer

1410-1510  Session 3. Swifts: Variance (Chair: Geoff Jones). Boardroom: Genetics (Chair: John Koolaard)

  1410  Swifts: Variance estimation for systematic designs in spatial surveys - Rachel Fewster, University of Auckland
        Boardroom: Developing modules in GenePattern for gene expression analysis - Marcus Davy, Plant and Food Research

  1430  Swifts: Variance components analysis for balanced and unbalanced data in reliability of gait measurement - Mohammadreza Mohebbi, Monash University
        Boardroom: High dimensional QTL analysis within complex linear mixed models - Julian Taylor, CSIRO Mathematics Informatics and Statistics

  1450  Swifts: Modernizing AMOVA using ANOVA - Hwan-Jin Yoon, ANU
        Boardroom: Correlation of transcriptomic and phenotypic data in dairy cows - Zaneta Park, AgResearch

1510  Afternoon Tea (30 minutes)

1540-1700  Session 4. Swifts: Modelling (Chair: Mario D'Antuono). Boardroom: Ecology (Chair: Rachel Fewster)

  1540  Swifts: Non-inferiority margins in clinical trials - Simon Day, Roche Products Ltd
        Boardroom: Visualising model selection criteria for presence and absence data in ecology - Samuel Mueller, University of Sydney

  1600  Swifts: Data processing using Excel with R - Andrew McLachlan, Plant and Food Research, Lincoln
        Boardroom: Estimating weights for constructing composite environmental indices - Ross Darnell, CSIRO Mathematics Informatics and Statistics

  1620  Swifts: Investigating covariate effects on BDD infection with longitudinal data - Geoffrey Jones, Massey University
        Boardroom: A spatial design for monitoring the health of a large-scale freshwater river system - Melissa Dobbie, CSIRO Mathematics Informatics and Statistics

  1640  Swifts: Statistical modelling of intrauterine growth for Filipinos - Vincente Balinas, University of the Philippines Visayas
        Boardroom: Backfitting estimation of a response surface model - Jhoanne Marsh C Gatpatan, University of the Philippines Visayas

1700  Poster Session. Chair: Melissa Dobbie

1800  Dinner (own arrangement)

TUESDAY 1ST DEC

900   Keynote Address (Swifts) - Martin Bland, University of York: Clustering by treatment provider in randomised trials. Chair: Simon Day

950-1030   Session 1. Swifts: Missing Data (Chair: Vanessa Cave). Boardroom: Count Data (Chair: Hwan-Jin Yoon)

  950   Swifts: The future of missing data - Herbert Thijs, Hasselt University
        Boardroom: A strategy for modelling count data which may have extra zeros - Alan Welsh, ANU

  1010  Swifts: Application of latent class with random effects models to longitudinal data - Ken Beath, Macquarie University
        Boardroom: A reliable constrained method for identity link Poisson regression - Ian Marschner, Macquarie University

1030  Morning Tea / IBS Biennial General Meeting (60 minutes)

1130-1230  Session 2. Swifts: Medical (Chair: Hans Hockey). Boardroom: Modelling (Chair: Olena Kravchuk)

  1130  Swifts: Multivariate response models for global health-related quality of life - Annette Kifley, Macquarie University
        Boardroom: Building a more stable predictive logistic regression model - Anna Campain, University of Sydney

  1150  Swifts: Estimation of optimal dynamic treatment regimes from longitudinal observational data - Liliana Orellana, Universidad de Buenos Aires
        Boardroom: Stepwise paring down variation for identifying influential multifactor interactions - Jing-Shiang Hwang, Academia Sinica

  1210  Swifts: Parametric conditional frailty models for recurrent cardiovascular events in the LIPID study - Jisheng Cui, Deakin University
        Boardroom: Empirical likelihood estimation of a diagnostic test likelihood ratio - David Matthews, University of Waterloo

1230  Lunch (1 hour)

1330  Organised Social Activities

1800  Dinner (own arrangement)

WEDNESDAY 2ND DEC

900   Keynote Address (Swifts) - Thomas Lumley, University of Washington: Using the whole cohort in analysis of subsampled data. Chair: Alan Welsh

950-1030   Session 1. Swifts: Clinical Trials (Chair: Ian Marschner). Boardroom: Fisheries (Chair: Charis Burridge)

  950   Swifts: Adjusting for nonresponse in case-control studies - Alastair Scott, University of Auckland
        Boardroom: An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimation - Hideyasu Shimadzu, Geoscience Australia

  1010  Swifts: Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associations - Victor Kipnis, USA National Cancer Institute
        Boardroom: On the 2008 World Fly Fishing Championships - Thomas Yee, University of Auckland

1030  Morning Tea (30 minutes)

1100-1220  Session 2. Swifts: Medical Models (Chair: Katrina Poppe). Boardroom: Agriculture/Horticulture (Chair: Emlyn Williams)

  1100  Swifts: Relative risk estimation in randomised controlled trials: a comparison of methods for independent observations - Lisa Yelland, University of Adelaide
        Boardroom: Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactation - Roger Littlejohn, AgResearch

  1120  Swifts: Multiple stage procedures in covariate-adjusted response-adaptive designs - Eunsik Park, Chonnam National University
        Boardroom: Some statistical approaches in estimating lambing rates - Mario D'Antuono, Dept of Agriculture WA

  1140  Swifts: Potential outcomes and propensity score methods for hospital performance comparisons - Patrick Graham, University of Otago
        Boardroom: FTIR analysis associations with induction and release of kiwifruit buds from dormancy - Denny Meyer, Swinburne University of Technology

  1200  Swifts: Local odds ratio estimation for multiple response contingency tables - Ivy Liu, Victoria University
        Boardroom: Non-linear mixed-effects modelling for a soil temperature study - Pauline Ding, ANU

1220  Lunch (1 hour 10 minutes)

1330  Invited Speaker (Swifts) - Alison Smith, NSW Department of Industry and Investment: Embedded partially replicated designs for grain quality testing. Chair: David Baird

1410-1510  Session 3. Swifts: Design (Chair: Ross Darnell). Boardroom: Functional Analysis (Chair: Marcus Davy)

  1410  Swifts: Spatial models for plant breeding trials - Emlyn Williams, ANU
        Boardroom: Can functional data analysis be used to develop a new measure of global cardiac function? - Katrina Poppe, University of Auckland

  1430  Swifts: A two-phase design for a high-throughput proteomics experiment - Kevin Chang, University of Auckland
        Boardroom: Variable penalty dynamic warping for aligning GC-MS data - David Clifford, CSIRO

  1450  Swifts: Shrinking sea-urchins in a high CO2 world: a two-phase experimental design - Kathy Ruggiero, University of Auckland
        Boardroom: A model for the enzymatically 18O-labeled MALDI-TOF mass spectra - Tomasz Burzykowski, Hasselt University

1510  Afternoon Tea (30 minutes)

1540-1700  Session 4. Swifts: Methods (Chair: David Clifford). Boardroom: Mixtures & Classification (Chair: Thomas Yee)

  1540  Swifts: High-dimensional multiple hypothesis testing with dependence - Sandy Clarke, University of Melbourne
        Boardroom: On estimation of nonsingular normal mixture densities - Michael Stewart, University of Sydney

  1600  Swifts: Metropolis-Hastings algorithms with adaptive proposals - Renate Meyer, University of Auckland
        Boardroom: Estimation of finite mixtures with nonparametric components - Chew-Seng Chee, University of Auckland

  1620  Swifts: Bayesian inference for multinomial probabilities with non-unique cell classification and sparse data - Nokuthaba Sibanda, Victoria University
        Boardroom: Classification techniques for class imbalance data - Siva Ganesh, Massey University

  1640  Swifts: Filtering in high dimension dynamic systems using copulas - Jonathon Briggs, University of Auckland
        Boardroom: Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the mean - Selvanayagam Ganesalingam, Massey University

1800  Conference Dinner

THURSDAY 3RD DEC

900   Keynote Address (Swifts) - Chris Triggs, University of Auckland: Nutrigenomics - a source of new statistical challenges. Chair: Ruth Butler

950-1030   Session 1. Swifts: Genetics (Chair: Ken Dodds). Boardroom: Ecology (Chair: Duncan Hedderley)

  950   Swifts: Combination of clinical and genetic markers to improve cancer prognosis - Kim-Anh Le Cao, University of Queensland
        Boardroom: A multivariate feast among bandicoots at Heirisson Prong - Teresa Neeman, ANU

  1010  Swifts: Effective population size estimation using linkage disequilibrium and diffusion approximation - Jing Liu, University of Auckland
        Boardroom: Environmental impact assessments: a statistical encounter - Dave Saville, Saville Statistical Consulting Ltd

1030  Morning Tea (30 minutes)

1100  Invited Speaker (Swifts) - Kaye Basford, University of Queensland: Ordination of marker-trait association profiles from long-term international wheat trials. Chair: Lyn Hunt

1140-1220  Session 2. Swifts: Medical (Chair: Ken Beath). Boardroom: Genetics (Chair: Julian Taylor)

  1140  Swifts: Finding best linear combination of markers for a medical diagnostic with restricted false positive rate - Yuan-chin Chang, Academia Sinica
        Boardroom: Believing in magic: validation of a novel experimental breeding design - Emma Huang, CSIRO Mathematics Informatics and Statistics

  1200  Swifts: A modified combination test for the analysis of clinical trials - Markus Neuhäuser, Rhein Ahr Campus
        Boardroom: Phenotypes for training and validation of whole genome selection methods - Ken Dodds, AgResearch

1220  Closing Remarks

1230  Lunch

1300  Conference Concludes

ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts): Louise Ryan, CSIRO Mathematics Informatics and Statistics. Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise Ryan
CSIRO Mathematics Informatics and Statistics

E-mail: Louise.Ryan@csiro.au

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. in the insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.


950 - 1030, MONDAY 30TH NOV
Session 1 (Swifts): Medical. Chair: John Field

SUB-SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING A COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari(1), Ebrahim Hajizadeh(1), Anoshirvan Kazemnejad(1) and Seyed Reza Fatemi(2)

(1) Department of Biostatistics, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
(2) Shahid Beheshti University of Medical Sciences, Gastrointestinal Research Center, Tehran, Iran

E-mail: masghari862@gmail.com

Colorectal cancer (CRC) is one of the most common malignant cancers in the world, and its pattern varies because of the differing effects of risk factors in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluating the risk factors of the cancer as a whole would not provide a thorough understanding of the cancer. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis according to the pathology reports of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007 were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis in Stata. The results confirm gender, alcohol history, IBD and tumor grade as specific risk factors of colon cancer, and hypertension, opium use and personal history as specific risk factors of rectum cancer. BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity, and colon and rectum cancers should be evaluated separately to reveal hidden associations which may not be revealed under general modeling. These findings could provide more information for prognosis and treatment, and for the possible application of screening programs specifically for colon and rectum carcinomas.


PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

Mary Barnes(1), Robert Fitridge(2) and Maggi Boult(2)

(1) CSIRO Mathematics Informatics and Statistics, Glen Osmond, South Australia
(2) Department of Surgery, University of Adelaide, The Queen Elizabeth Hospital, Adelaide, South Australia

E-mail: Mary.Barnes@csiro.au

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18-month period between 1999 and 2001 and whose outcomes were followed for more than five years.

The ERA model is available at www.health.adelaide.edu.au/surgery/evar. It enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using data from a specialist UK vascular institute. Despite UK patients being sicker (p<0.001), having larger aneurysms (p<0.001) and being more likely to die (p<0.05) than the Australian patients, the ERA model fitted the UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher areas under ROC curves and/or higher R2.

The ERA model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008) Eur J Vasc Endovasc Surg 35:571-579.
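
The abstract describes fitting a separate forward stepwise logistic regression for each endpoint. As a rough illustration of that kind of model selection (not the authors' actual ERA fit), a sketch in R might look like the following; the simulated data and variable names are hypothetical stand-ins for the preoperative audit variables listed above.

    ## Hedged sketch of forward stepwise logistic regression for one endpoint;
    ## simulated stand-in data, not the ERA audit data.
    set.seed(1)
    evar <- data.frame(age = rnorm(500, 74, 7), aneurysm_diameter = rnorm(500, 60, 10),
                       creatinine = rnorm(500, 0.10, 0.03), neck_length = rnorm(500, 25, 8))
    evar$early_death <- rbinom(500, 1, plogis(-6 + 0.05 * evar$age + 0.02 * evar$aneurysm_diameter))

    null_fit <- glm(early_death ~ 1, family = binomial(link = "logit"), data = evar)
    fwd_fit  <- step(null_fit,
                     scope = ~ age + aneurysm_diameter + creatinine + neck_length,
                     direction = "forward")

    summary(fwd_fit)                                          # selected preoperative variables
    predict(fwd_fit, newdata = evar[1:3, ], type = "response")  # predicted risks for new patients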


950 - 1030, MONDAY 30TH NOV
Session 1 (Boardroom): Ecological Modelling. Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge(1), Geoff Laslett(1) and Rob Kenyon(2)

(1) CSIRO Mathematics Informatics and Statistics
(2) CSIRO Marine and Atmospheric Research

E-mail: charisburridge@csiro.au

Since the 1970s, CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that an annual multi-species fishery-independent survey be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region, as well as for the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
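
For readers who want to experiment with the same class of model, a penalised regression spline surface for survey density data can also be fitted in R with the mgcv package; the sketch below uses simulated data and REML estimation, and is only an editorial illustration, not the authors' BayesX/MCMC analysis.

    ## Illustrative penalised regression spline surface for density data (mgcv);
    ## simulated stand-in data, not the NPF survey.
    library(mgcv)
    set.seed(1)
    trawl <- data.frame(lon = runif(300, 136, 141), lat = runif(300, -17, -13))
    trawl$density <- rgamma(300, shape = 2,
                            rate = 2 / exp(1 + 0.3 * sin(trawl$lon) + 0.3 * cos(trawl$lat)))

    fit <- gam(density ~ s(lon, lat, k = 60),    # thin-plate spline over space
               family = Gamma(link = "log"), data = trawl, method = "REML")
    summary(fit)
    plot(fit, scheme = 2)                        # fitted density surface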


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang(1) and Liya Fu(2)

(1) CSIRO Mathematics Informatics and Statistics, Australia
(2) School of Mathematics and Statistics, Northeast Normal University, China

E-mail: you-ganwang@csiro.au

We investigate rank regression for environmental data analysis. Rank regression is robust, has been found to be more natural when substantial proportions of the observations are below detection limits (censored), and is more efficient when errors have heavy-tailed distributions. To alleviate the computational burden, we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.
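
To illustrate the basic idea of rank (Wilcoxon-score) regression that this talk builds on, the sketch below fits a simple straight line in base R by minimising Jaeckel's rank-based dispersion function on simulated heavy-tailed data; it does not reproduce the authors' induced smoothing estimator or their handling of censored observations.

    ## Base-R sketch of Wilcoxon-score rank regression (Jaeckel's dispersion);
    ## simulated data, illustration only.
    set.seed(1)
    n <- 200
    x <- runif(n)
    y <- 1 + 2 * x + rt(n, df = 2)                 # heavy-tailed errors

    dispersion <- function(b) {                    # dispersion as a function of the slope
      e <- y - b * x
      a <- sqrt(12) * (rank(e) / (n + 1) - 0.5)    # centred Wilcoxon scores
      sum(a * e)
    }

    b_hat <- optimize(dispersion, interval = c(-10, 10))$minimum
    a_hat <- median(y - b_hat * x)                 # intercept from the residual median
    c(intercept = a_hat, slope = b_hat)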


1100 - 1220, MONDAY 30TH NOV
Session 2 (Swifts): Modelling. Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird
VSN NZ Limited

E-mail: david@vsn.co.nz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize Σ e (Q - I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
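
As a small companion to the abstract, the following sketch fits several quantiles of a linear model in R with the quantreg package (one of the implementations mentioned above), using R's built-in cars data; it is illustrative only and is not drawn from the talk.

    ## Quantile regression at the 10th, 50th and 90th percentiles (quantreg).
    library(quantreg)

    fit <- rq(dist ~ speed, tau = c(0.1, 0.5, 0.9), data = cars)
    summary(fit)                                             # one coefficient set per quantile

    plot(dist ~ speed, data = cars)
    for (j in 1:3) abline(coef = coef(fit)[, j], lty = j)    # fitted quantile lines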


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski(1), Junaidi Sutan(1), Darfiana Nur(1) and Kerrie Mengersen(2)

(1) School of Mathematical and Physical Sciences, University of Newcastle
(2) Queensland University of Technology

E-mail: Elizabeth.Stojanovski@newcastle.edu.au

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study, a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially, the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed to be distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.
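
For context on the first stage of such an analysis, the sketch below carries out a minimal classical random-effects pooling of log risk ratios (DerSimonian-Laird) in R; the study inputs are made up, and the abstract's actual analysis is the Bayesian hierarchical model fitted in WinBUGS, which adds study-level covariates and partial exchangeability on top of this basic structure.

    ## Classical random-effects pooling of log risk ratios (DerSimonian-Laird);
    ## made-up study estimates, shown only as a point of reference.
    rr <- c(1.8, 2.5, 1.4, 3.0, 2.2, 1.9)          # hypothetical study risk ratios
    se <- c(0.40, 0.35, 0.50, 0.45, 0.30, 0.55)    # hypothetical SEs of log(RR)
    yi <- log(rr); vi <- se^2; k <- length(yi)

    w    <- 1 / vi                                 # fixed-effect weights
    ybar <- sum(w * yi) / sum(w)
    Q    <- sum(w * (yi - ybar)^2)                 # heterogeneity statistic
    tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))

    wstar <- 1 / (vi + tau2)                       # random-effects weights
    mu    <- sum(wstar * yi) / sum(wstar)
    ci    <- mu + c(-1.96, 1.96) / sqrt(sum(wstar))
    exp(c(pooled_RR = mu, lower = ci[1], upper = ci[2]))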


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie
Department of Statistics and School of Geography, Geology and Environment, University of Auckland

E-mail: mpir007@aucklanduni.ac.nz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are the result of a possible failure of the uniformitarianism principle. This is because the response of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and hence this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring width indices for each core was divided into two subsets:

1. the portion of the series produced when the trees were small, and

2. the portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham
Analytics, SAS ANZ

E-mail: Michael.Graham@sas.com

Many organizations relying on SAS also use R, which offers some of them a way to experiment with new cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together, and plans for future integration.

1100 - 1220

MONDAY 30TH NOV, Session 2 (Boardroom): Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold(1), Yu Hayakawa(2) and Paul Yip(3)

(1) Victoria University of Wellington, NZ
(2) Waseda University, Japan
(3) Hong Kong University

E-mail: richardarnold@msor.vuw.ac.nz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use reversible jump MCMC to model both sources of heterogeneity, and their interaction, using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester(1), Melissa Snape(2) and Lyn Hinds(2)

(1) Statistical Consulting Unit, ANU
(2) Invasive Animals CRC

E-mail: Bob.Forrester@anu.edu.au

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaCon(TM) is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals, but as yet has not been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaCon (Vac1), or a single vaccination of GonaCon followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions, at irregular intervals, over the next 115 weeks. Of particular interest is whether there is any difference between the animals that receive the single or the boosted vaccination.

The data are analysed using repeated measures methods to assess the long-term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of options available. Some approaches are explored and the differences between the results examined.
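
One standard option for repeated measures taken at irregular intervals is a linear mixed model with a continuous-time AR(1) correlation structure, which does not require equal spacing. The sketch below shows how such a model might be set up in R with nlme on simulated stand-in data; it is not necessarily one of the approaches the authors compare.

    ## Linear mixed model with continuous-time AR(1) errors for unequally
    ## spaced repeated measures; simulated stand-in data, illustration only.
    library(nlme)
    set.seed(1)
    d <- expand.grid(animal = factor(1:35), occasion = 1:18)
    d$weeks     <- rep(sort(sample(1:115, 18)), each = 35)       # irregular measurement times
    d$treatment <- factor(rep(c("Control", "Vac1", "Vac2"),
                              length.out = 35))[as.integer(d$animal)]
    d$weight    <- 2 + 0.05 * d$weeks + rnorm(nrow(d), sd = 0.3)

    fit <- lme(weight ~ treatment * weeks,
               random      = ~ 1 | animal,
               correlation = corCAR1(form = ~ weeks | animal),   # handles unequal spacing
               data        = d)
    anova(fit)                                                   # treatment-by-time effects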


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell(1), Piers Dunstan(2) and Scott Foster(1)

(1) CSIRO Mathematics Informatics and Statistics
(2) CSIRO Wealth from Oceans Flagship

E-mail: Ross.Darnell@csiro.au

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and to multispecies management.


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S. S. Clark
University of Otago

E-mail: aclark@maths.otago.ac.nz

When the chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that the cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the chi-square test and (2) how to find the test statistic and the associated degrees of freedom. The test statistic and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S. W. (1958, Annals of Mathematical Statistics, 885-891) and Huynh, H. & Feldt, L. S. (1976, Journal of Educational Statistics, 69-82). We will use an example of influenza symptoms from two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.

1330, MONDAY 30TH NOV
Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland NZ2University of California Davis US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.
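
As a much simpler illustration of the kind of efficiency issue being discussed (this example is mine, not from the talk): growing an object inside a loop forces repeated copying, whereas preallocating or vectorising avoids it.

## Sketch: a common R efficiency trap
n <- 2e4
x <- runif(n)

slow <- function(x) {            # repeatedly copies and extends 'out'
  out <- numeric(0)
  for (i in seq_along(x)) out <- c(out, x[i]^2)
  out
}
faster <- function(x) {          # preallocate once
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- x[i]^2
  out
}
vectorised <- function(x) x^2    # let compiled code do the loop

system.time(slow(x))
system.time(faster(x))
system.time(vectorised(x))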

1410 - 1510 MONDAY 30TH NOV
Session 3 (Swifts): Variance
Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1
1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modelling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.
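
Not the new estimator described in the talk, but a small numerical illustration of the baseline problem: treating a systematic spatial sample as if it were a simple random sample, versus a successive-difference estimator that exploits the ordering of neighbouring units (all names and numbers below are made up for the sketch).

## Sketch: variance of the sample mean under a systematic design
set.seed(1)
N <- 1000; k <- 20
pop <- 5 + 0.01 * (1:N) + rnorm(N)             # population with a spatial trend
start <- sample(k, 1)
y <- pop[seq(start, N, by = k)]                # systematic sample, every k-th unit
n <- length(y)

var_srs <- var(y) / n                          # ignores the systematic design
var_sd  <- sum(diff(y)^2) / (2 * n * (n - 1))  # successive-difference estimator
c(srs = var_srs, successive_difference = var_sd)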

VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1,2, Rory Wolfe1,2, Jennifer McGinley2, Pamela Simpson1,2, Pamela Murphy1,2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods was illustrated in examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.
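
A minimal sketch of a variance-components fit of the kind described, using REML via lme4 and turning the components into a reliability coefficient; the data frame and column names (gait, subject, assessor, session, score) are hypothetical.

## Sketch: variance components for a gait reliability study
## Assumes several strides per subject x assessor x session combination.
library(lme4)

vc_fit <- lmer(score ~ 1 + (1 | subject) + (1 | subject:assessor) +
                 (1 | subject:assessor:session), data = gait, REML = TRUE)
vc <- as.data.frame(VarCorr(vc_fit))
sigma2 <- setNames(vc$vcov, vc$grp)
sigma2

## reliability (intraclass correlation): between-subject over total variance
sigma2["subject"] / sum(sigma2)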

MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlex.

Using fungal microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages, in which ANOVA and REML are standard methods, may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
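
A sketch of the equivalence being exploited: the AMOVA layout is a hierarchical ANOVA, so the variance components can be extracted with standard tools. The data frame and response below (amova_dat, y: a per-individual molecular score) are placeholders.

## Sketch: AMOVA-style hierarchical variance components with general software
library(lme4)

## REML components: among regions, among populations within regions, residual
fit <- lmer(y ~ 1 + (1 | region) + (1 | region:population), data = amova_dat)
VarCorr(fit)

## classical sums-of-squares layout, for comparison
summary(aov(y ~ region / population, data = amova_dat))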

1410 - 1510 MONDAY 30TH NOV
Session 3 (Boardroom): Genetics
Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular, we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.

HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla1,2

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation, such as spatial trends and extraneous environmental variation, need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high dimensional genetic component becomes problematic. This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way, even when the number of genetic variables exceeds the number of observations. The method is then applied to wheat quality traits and a well established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.

CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, and associated phenotypic data (milk yield; protein, casein and total solids percentage and yield; and growth hormone, IGF and insulin levels). These data are highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data were analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data were used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.
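
A stripped-down sketch of the first-pass correlation screen described above (the real analysis went on to fit sire as a random term); the objects expr (a gene x cow expression matrix) and milk_yield (one phenotype value per cow) are hypothetical.

## Sketch: screening ~24k genes against one phenotype by simple correlation
screen_one <- function(g, pheno) {
  ct <- cor.test(g, pheno)
  c(r = unname(ct$estimate), p = ct$p.value)
}

res <- as.data.frame(t(apply(expr, 1, screen_one, pheno = milk_yield)))
res$p_adj <- p.adjust(res$p, method = "BH")   # allow for the number of tests
head(res[order(res$p), ])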

1540 - 1700 MONDAY 30TH NOV
Session 4 (Swifts): Modelling
Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined. Some are highly statistical, some are based much more on clinical judgement; some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum, we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment, nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, the convenience of using the medication, and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.

DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample, these texture analysis and rheological methods generated many data points which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively by eye alone. I describe an Excel-based system using Excel macros and R (via RExcel) that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly.
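
A rough base-R sketch of the kind of point-of-interest extraction described (main peak and point of steepest slope on a measured curve); the smoothing step and the column names in curve_dat are assumptions.

## Sketch: objective points of interest on a texture-analysis trace
fit  <- smooth.spline(curve_dat$t, curve_dat$y)           # smooth the raw curve
grid <- seq(min(curve_dat$t), max(curve_dat$t), length.out = 1000)
yhat  <- predict(fit, grid)$y
slope <- predict(fit, grid, deriv = 1)$y

c(peak_at      = grid[which.max(yhat)],                   # main peak
  max_slope_at = grid[which.max(slope)])                  # steepest rise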

INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ
3Department of Statistics, UC Irvine, USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful, hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern, as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.

STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurement of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to those in previous studies, in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth in different populations differs. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length

1540 - 1700 MONDAY 30TH NOV
Session 4 (Boardroom): Ecology
Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled as a function of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.
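
Not the diagrams themselves, but a sketch of the underlying computation: scoring a small set of candidate presence/absence models by AIC and BIC. The data frame plots and the covariate names (food, cover, dist_refuge) are invented for illustration.

## Sketch: information criteria across candidate logistic regressions
candidates <- list(
  present ~ 1,
  present ~ food,
  present ~ food + cover,
  present ~ food + cover + dist_refuge,
  present ~ food * cover + dist_refuge
)
fits <- lapply(candidates, glm, family = binomial, data = plots)

data.frame(model = sapply(candidates, deparse),
           AIC   = sapply(fits, AIC),
           BIC   = sapply(fits, BIC))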

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains, such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general, the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.

A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Mathematics, Informatics and Statistics, Australia
2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, handling the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second order model, central composite design
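
A bare-bones illustration of the backfitting idea for an additive model with two smooth terms, on simulated data (this is the generic algorithm, not the authors' response-surface implementation).

## Sketch: backfitting for y = f1(x1) + f2(x2) + error, using smoothing splines
set.seed(1)
n  <- 200
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)

alpha <- mean(y)
f1 <- rep(0, n); f2 <- rep(0, n)
for (iter in 1:20) {
  f1 <- predict(smooth.spline(x1, y - alpha - f2), x1)$y   # update f1 given f2
  f1 <- f1 - mean(f1)                                      # centre for identifiability
  f2 <- predict(smooth.spline(x2, y - alpha - f1), x2)$y   # update f2 given f1
  f2 <- f2 - mean(f2)
}
plot(x1, f1); plot(x2, f2)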

TUESDAY 1ST DEC

900 Keynote Address (Swifts)
Martin Bland, University of York
Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland1
1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.
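
One common way of allowing for such clustering, not necessarily the simple method proposed in the talk, is a random intercept for the operator; the variable names below are hypothetical.

## Sketch: clustering by surgeon/therapist in an individually randomised trial
## Assumes 'trial' has columns: outcome (continuous), arm, operator.
library(lme4)

fit <- lmer(outcome ~ arm + (1 | operator), data = trial)
summary(fit)    # treatment effect with operator-level clustering acknowledged
VarCorr(fit)    # how much outcome variation sits between operators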

950 - 1030 TUESDAY 1ST DEC
Session 1 (Swifts): Missing Data
Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were more recently developed in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above-mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision, it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.
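
A sketch of one standard direct-likelihood implementation for incomplete longitudinal scores (all observed data analysed under an MAR assumption, nothing carried forward): a GLS fit with unstructured covariance. The data frame pain and its columns are assumptions, and this is not necessarily the authors' exact model.

## Sketch: direct likelihood (MMRM-style) for incomplete longitudinal pain scores
## 'pain' in long format with columns id, visit (factor), treat, score;
## missing visits are simply absent -- no imputation or carrying forward.
library(nlme)

pain$visit_num <- as.integer(pain$visit)        # numeric index for corSymm
fit <- gls(score ~ treat * visit,
           correlation = corSymm(form = ~ visit_num | id),   # unstructured correlation
           weights = varIdent(form = ~ 1 | visit),           # visit-specific variances
           data = pain, na.action = na.omit)
summary(fit)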

APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

950 - 1030 TUESDAY 1ST DEC
Session 1 (Boardroom): Count Data
Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

I will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
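
A sketch of the kind of model sequence discussed: start from a simple Poisson fit, check for overdispersion and excess zeros, then move to richer models. The package choices and variable names below are my assumptions, not the speaker's prescription.

## Sketch: working up from a Poisson regression when counts may have extra zeros
fit_pois <- glm(y ~ x, family = poisson, data = counts)

## crude overdispersion check: Pearson chi-square / residual df
sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

## observed zeros versus zeros expected under the Poisson fit
c(observed = sum(counts$y == 0), expected = sum(dpois(0, fitted(fit_pois))))

## richer alternatives: negative binomial and zero-inflated negative binomial
library(MASS); library(pscl)
fit_nb   <- glm.nb(y ~ x, data = counts)
fit_zinb <- zeroinfl(y ~ x | x, dist = "negbin", data = counts)
AIC(fit_pois, fit_nb, fit_zinb)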

A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients to be non-negative rather than the fitted means. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.
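
A toy version of the core EM step for an additive Poisson model with non-negative covariates and non-negative coefficients (the talk's full method, which removes the non-negativity restriction on coefficients via maximisations over subsets of the parameter space, is not reproduced here).

## Sketch: EM for y_i ~ Poisson(mu_i) with mu_i = sum_j X[i, j] * beta[j],
## X >= 0 and beta >= 0. Splitting each count across covariates gives a
## simple multiplicative update.
em_identity_poisson <- function(X, y, maxit = 1000, tol = 1e-8) {
  beta <- rep(mean(y) / ncol(X), ncol(X))      # positive starting values
  for (it in 1:maxit) {
    mu <- as.vector(X %*% beta)
    beta_new <- beta * as.vector(t(X) %*% (y / mu)) / colSums(X)
    if (max(abs(beta_new - beta)) < tol) return(beta_new)
    beta <- beta_new
  }
  beta
}

set.seed(1)
x <- runif(200)
X <- cbind(1, x)                                # intercept plus one covariate
y <- rpois(200, lambda = 2 + 3 * x)             # additive (identity link) truth
em_identity_poisson(X, y)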

1130 - 1230 TUESDAY 1ST DEC
Session 2 (Swifts): Medical
Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.

ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1, Andrea Rotnitzky2,3 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T. di Tella, Buenos Aires, Argentina
3Harvard School of Public Health, Boston, USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on the efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV positive patients to illustrate estimation of the optimal CD4 count level to start HAART

PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1, Andrew Forbes2, Adrienne Kirby3, Ian Marschner4, John Simes3, Malcolm West5, Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction (MI) events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models, the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that the cholesterol-lowering drug pravastatin (the intervention being tested in the trial) had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk for those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.

1130 - 1230 TUESDAY 1ST DEC
Session 2 (Boardroom): Modelling
Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney, F07, Sydney, NSW 2006
2School of Mathematics and Statistics, University of Sydney, F07, Sydney, NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities, and is in that sense superior to more basic procedures including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing parameter estimates after imputation similar to those found when the data are fully observed. It will be shown that the amount of missingness present in the data set, and the nature of the variable in question, affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.

STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists facilitate techniques to produce high dimensional data to unveil hidden information. Although several model based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple, as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.

EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e., the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1/p2 and r- = (1 - p1)/(1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio r(x) = f1(x)/f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating r(x) and illustrate its application to test outcome results for the CA 9-19 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.
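
Not the empirical-likelihood estimator itself, but a naive kernel-based estimate of the same target r(x) = f1(x)/f2(x), useful as a point of comparison; simulated values stand in for the antigen measurements.

## Sketch: crude kernel estimate of a diagnostic likelihood ratio curve
set.seed(1)
diseased <- rlnorm(90, meanlog = 3, sdlog = 1)   # placeholder for the cancer group
controls <- rlnorm(51, meanlog = 2, sdlog = 1)   # placeholder for the pancreatitis group

xr <- range(c(diseased, controls))
f1 <- density(diseased, from = xr[1], to = xr[2], n = 512)
f2 <- density(controls, from = xr[1], to = xr[2], n = 512)

plot(f1$x, f1$y / f2$y, type = "l",
     xlab = "test measurement", ylab = "estimated r(x) = f1(x)/f2(x)")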

WEDNESDAY 2ND DEC

900 Keynote Address (Swifts)
Thomas Lumley, University of Washington
Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample for a "validation study" or a sample stratified on a health outcome for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.

950 - 1030 WEDNESDAY 2ND DEC
Session 1 (Swifts): Clinical Trials
Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al. (2002, Biometrical J 44, 227-239) investigated the use of weighted methods, originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient, methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al. 1997, Ann Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al. for their simulations.

CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, National Cancer Institute, USA
2Texas A&M University
3Gertner Institute for Epidemiology and Policy Research, Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.

950 - 1030 WEDNESDAY 2ND DEC
Session 1 (Boardroom): Fisheries
Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary tasks for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model reflecting the sampling process commonly used in marine surveys is proposed, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling is quite influential on presence/absence measures of species, and is no longer ignorable.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.

1100 - 1220 WEDNESDAY 2ND DEC
Session 2 (Swifts): Medical Models
Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
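
A sketch of two of the candidate methods compared here: log binomial regression and the "modified Poisson" approach (log Poisson regression with a robust sandwich variance). The data frame rct and its columns are hypothetical.

## Sketch: adjusted relative risk for a binary outcome in an RCT
library(sandwich); library(lmtest)

## log binomial regression (may fail to converge in some data sets)
fit_lb <- glm(outcome ~ arm + x, family = binomial(link = "log"), data = rct)
exp(coef(fit_lb)["arm"])

## modified Poisson: log link Poisson with robust variance (Zou 2004)
fit_mp <- glm(outcome ~ arm + x, family = poisson(link = "log"), data = rct)
coeftest(fit_mp, vcov = vcovHC(fit_mp, type = "HC0"))
exp(coef(fit_mp)["arm"])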

MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail espark02gmailcom

The idea of a response adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial, without diminishing its statistical significance and efficiency too much. In addition, innovation in genomic-related biomedical research makes personalized medicine possible, which also makes adjustment for the covariates of subjects who join the trial an important issue in a clinical trial.

Adaptive design is a longstanding statistical method for situations where the design of a statistical model involves unknown parameters that must be estimated during the course of an experiment. Thus the concept of sequential analysis is naturally involved. The large sample properties of estimation under such a problem have been studied and can be found in the literature, for example Zhang et al. (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires that both the estimation and design procedures be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure, the multiple-stage method, which requires the estimation and design to be updated at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it maintains the advantage of the fully sequential method to some level and is more convenient in practical operation. Here we study a three-stage procedure based on a logistic regression model, which is very popular in evaluating treatment effects when binary responses are observed. A numerical study of synthesized data is also presented.

Traditionally, we use a response-adaptive (RA) design by assuming there is no treatment-covariate interaction effect, where the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical reason, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the method with the RA design will make incorrect treatment allocations; that is, it can be correct in only one part of the population but completely wrong in the other. Thus, in this case the CARA design should perform better than the RA design.

In this work we also compare sequential analysis in response adaptive designs with and without covariate adjustment, and a numerical study of synthesized data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail patrickgrahamotagoacnz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al. (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes; and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30 day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.

LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail iliumsorvuwacnz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, many surveys have a situation in which respondents may select more than one outcome category, so the observations can fall in more than one category in the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.

1100 - 1220 WEDNESDAY 2ND DEC
Session 2 (Boardroom): Agriculture/Horticulture
Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail rogerlittlejohnagresearchconz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail mdantuonoagricwagovau

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the 'seeming' lack of standard errors in many research papers in animal science in Australia and New Zealand.

FTIR ANALYSIS ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1 Murray Judd2 John Meekings3 Annette Richardson3 and Eric Walton4

1Swinburne University of Technology
2Seeka Kiwifruit Industries

3The New Zealand Institute for Plant and Food Research Ltd4University of Otago

E-mail dmeyerswineduau

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1

1Australian National University

E-mail Paulinedinganueduau

There is a growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were ground cover type (covered, uncovered), distance from the log (0 cm, 10 cm, 20 cm, 40 cm, 80 cm) and depth (1 cm, 5 cm). Two non-linear mixed models were used to study the different treatment effects.
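
The soil-temperature data themselves are not reproduced here, so the following is only a generic sketch of fitting a non-linear mixed model in R with nlme, using the built-in Loblolly data and a self-starting asymptotic regression with a random asymptote; it is not either of the two models used in the study.

library(nlme)
fm <- nlme(height ~ SSasymp(age, Asym, R0, lrc),
           data   = Loblolly,                    # built-in grouped data (trees measured over time)
           fixed  = Asym + R0 + lrc ~ 1,         # fixed effects for all three parameters
           random = Asym ~ 1,                    # random asymptote for each tree
           start  = c(Asym = 103, R0 = -8.5, lrc = -3.3))
summary(fm)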

13:30 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1

1Wagga Wagga Agricultural Institute, Australia; 2Rothamsted Research, Harpenden, UK

E-mail alisonsmithindustrynswgovau

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples. This precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.


14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3 (Swifts): Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1

1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second-phase laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand; 2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
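
A minimal LIMMA sketch of the kind of analysis referred to is given below; the simulated matrix and the single hypothetical dye-swap contrast are placeholders, not the actual sea-urchin Phase 2 design.

library(limma)
set.seed(1)
M <- matrix(rnorm(500 * 4), nrow = 500)           # stand-in for normalised log-ratios (500 genes x 4 arrays)
design <- cbind(AcidVsControl = c(1, -1, 1, -1))  # hypothetical dye-swap layout
fit <- eBayes(lmFit(M, design))                   # gene-wise linear models + moderated t-statistics
topTable(fit, coef = "AcidVsControl", number = 5) # top-ranked genes for the contrast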

14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3 (Boardroom): Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC

FUNCTION

Katrina Poppe1 Gillian Whalley1 Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately to relaxation.

Three-dimensional ultrasound images of the heart allow volume in the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against the first and second derivatives traces out a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation and so develop a new measure of global cardiac function.
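
A rough base-R sketch of the idea (with a simulated volume trace, not echocardiographic data): smooth the volume-time curve, take first and second derivatives, and examine the resulting three-dimensional loop.

set.seed(1)
t   <- seq(0, 1, length.out = 40)                      # one cardiac cycle, rescaled to [0, 1]
vol <- 100 - 40 * sin(pi * t)^2 + rnorm(40, sd = 1)    # hypothetical LV volume measurements
f   <- splinefun(t, vol)                               # interpolating spline for the volume function
tt  <- seq(0, 1, length.out = 200)
loop <- cbind(v = f(tt), dv = f(tt, deriv = 1), d2v = f(tt, deriv = 2))
pairs(loop)                                            # projections of the closed loop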


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology divides the substance into, and quantifies the amount of, each compound that makes up the test substance. Typically the first step in an analysis of data like this is the alignment of the data, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances - e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al. (Anal. Chem. 2009, 81(3), pp 1000-1007)
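
The essence of the penalised warping step can be written down in a few lines; the toy R implementation below adds a fixed penalty to every non-diagonal move in the usual dynamic-programming recursion, and is only in the spirit of the variable-penalty method, not the published algorithm.

dtw_penalised <- function(x, y, penalty = 0) {
  n <- length(x); m <- length(y)
  D <- matrix(Inf, n + 1, m + 1); D[1, 1] <- 0
  for (i in 1:n) for (j in 1:m) {
    d <- abs(x[i] - y[j])
    D[i + 1, j + 1] <- min(D[i, j] + d,               # diagonal step
                           D[i, j + 1] + d + penalty, # vertical step (penalised)
                           D[i + 1, j] + d + penalty) # horizontal step (penalised)
  }
  D[n + 1, m + 1]                                     # total alignment cost
}

x <- sin(seq(0, 6, by = 0.1)); y <- sin(seq(0, 6, by = 0.1) - 0.3)  # two shifted signals
c(unpenalised = dtw_penalised(x, y), penalised = dtw_penalised(x, y, penalty = 0.5))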


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1 Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium; 2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, e.g., two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may incorporate various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al. (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4 (Swifts): Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1

1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years as the amount of data available to statisticians grows from a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although this is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances when this is not the case, and these will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates, which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the family-wise error rate (FWER) or the FDR should enable us to improve the power of our procedures to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.
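
A small simulation in the same spirit, comparing Benjamini-Hochberg false discoveries for independent null statistics and for statistics with a simple moving-average (linear-process) dependence; the settings are illustrative only.

set.seed(1)
m <- 2000
fdr_hits <- function(z) sum(p.adjust(2 * pnorm(-abs(z)), method = "BH") < 0.05)
e <- rnorm(m + 1)
z_dep <- (e[-1] + e[-(m + 1)]) / sqrt(2)   # MA(1)-dependent null statistics
z_ind <- rnorm(m)                          # independent null statistics
c(dependent = fdr_hits(z_dep), independent = fdr_hits(z_ind))   # false discoveries under the null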


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand; 2University of South Carolina, USA; 3University of Montreal, Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities. Using various different examples we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.
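
The triangular/trapezoidal proposal mixtures themselves are not reproduced here; the toy R sampler below only illustrates the general adaptive idea, tuning a random-walk proposal scale from the running acceptance rate during burn-in, for an arbitrary example target.

set.seed(1)
log_target <- function(x) dgamma(x, shape = 3, rate = 1, log = TRUE)   # example target density
n <- 5000; x <- numeric(n); x[1] <- 1; s <- 1; acc <- 0
for (i in 2:n) {
  prop <- x[i - 1] + rnorm(1, sd = s)                 # random-walk proposal
  a <- log_target(prop) - log_target(x[i - 1])        # log acceptance ratio
  if (is.finite(a) && log(runif(1)) < a) { x[i] <- prop; acc <- acc + 1 } else x[i] <- x[i - 1]
  if (i <= 1000 && i %% 100 == 0)                     # crude adaptation during burn-in only
    s <- s * ifelse(acc / i > 0.44, 1.2, 0.8)
}
c(mean = mean(x[1001:n]), var = var(x[1001:n]))       # compare with Gamma(3, 1): mean 3, variance 3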


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL

CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1

1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) is used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1

1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4 (Boardroom): Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1

1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show that the new approach performs better through simulation studies and some real-world biological data sets.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally-sized or balanced, and the classification techniques assume that the misclassification errors have equal cost. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class/group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). The traditional classification techniques perform badly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being in building models to correctly classify the minority class.

In this presentation a brief overview of the approaches found in the literature is given, followed by details of some alternatives proposed. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples and the findings are discussed.
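
A minimal sketch of the first ('balance by sampling') approach on simulated data is shown below; it simply under-samples the majority class before fitting a logistic-regression classifier, and is not the comparison reported in the talk.

set.seed(1)
n_maj <- 2000; n_min <- 60
d <- data.frame(x1 = c(rnorm(n_maj, 0), rnorm(n_min, 1.5)),
                x2 = c(rnorm(n_maj, 0), rnorm(n_min, 1.5)),
                y  = factor(rep(c("majority", "minority"), c(n_maj, n_min))))
keep <- c(sample(which(d$y == "majority"), n_min), which(d$y == "minority"))   # under-sampled majority
fit_imbal <- glm(y ~ x1 + x2, family = binomial, data = d)          # trained on the imbalanced data
fit_bal   <- glm(y ~ x1 + x2, family = binomial, data = d[keep, ])  # trained on the balanced subset
# minority-class detection rate at a 0.5 cut-off, on the minority training cases
c(imbalanced = mean(predict(fit_imbal, d[d$y == "minority", ], type = "response") > 0.5),
  balanced   = mean(predict(fit_bal,   d[d$y == "minority", ], type = "response") > 0.5))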


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED

ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of error rates is of vital importance in classification problems, as this is used as a basis for choosing the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and the QDF are derived and computed for various covariance structures in a simulation exercise, which serve as a benchmark for comparison.

The approximation we introduce in this paper reduces the amount of computation involved. It also provides a closed-form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data
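
The analytical approximations are not reproduced here, but the setting is easy to mimic: the R sketch below estimates the QDF misclassification rate by simulation for two equal-mean, unequal-covariance normal populations, the kind of benchmark against which an approximation could be checked. The AEDC itself is not implemented.

library(MASS)
set.seed(1)
n <- 5000
sim_group <- function(Sigma) mvrnorm(n, mu = c(0, 0), Sigma = Sigma)
make_set <- function() data.frame(rbind(sim_group(diag(c(1, 1))), sim_group(diag(c(4, 0.25)))),
                                  grp = factor(rep(1:2, each = n)))
train <- make_set(); test <- make_set()
fit <- qda(grp ~ X1 + X2, data = train)       # quadratic discriminant function
mean(predict(fit, test)$class != test$grp)    # Monte Carlo estimate of the overall error rate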


THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS - A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1

1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory bowel diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high-throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9:50 - 10:30

THURSDAY 3RD DEC, Session 1 (Swifts): Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao12 Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia; 2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia; 3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France; 4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail klecaouqeduau

In cancer studies various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap. 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION

APPROXIMATION

Jing Liu1

1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift is used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.

9:50 - 10:30

THURSDAY 3RD DEC, Session 1 (Boardroom): Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS A STATISTICAL ENCOUNTER

Dave Saville1

1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting.


11:00 THURSDAY 3RD DEC, Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT

TRIALS

VN Arief1, IH Delacy1,2, J Crossa3, PM Kroonenberg4, MJ Dieters1 and KE Basford1,2

1The University of Queensland, Australia; 2Australian Centre for Plant Functional Genomics, Australia; 3CIMMYT, Mexico; 4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al. 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address the non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM blocks and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


11:40 - 12:20

THURSDAY 3RD DEC, Session 2 (Swifts): Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE

POSITIVE RATE

Yuan-chin Chang1

1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC, respectively). In some medical diagnostics the false positive rate must be confined within a specific range, which makes the pAUC a reasonable choice under such circumstances. Thus we emphasize the pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

lp = (wD SD + wD̄ SD̄)^(-1) (mD − mD̄),

where mD, SD and mD̄, SD̄ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients wD, wD̄ ∈ R1 depend on the given specificity and are also functions of lp. Thus the solution for lp requires an iterative procedure. We apply it to the data set of Liu et al. (2005, Stat in Med), and the numerical results show that our method outperforms that of Liu et al. (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al. (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes the pAUC. This method can be applied to problems whose markers outnumber the subjects. Some large-sample properties of this method are derived. We then apply it to some real data sets and the results are very promising, locating markers that are never found via AUC-based methods.
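
As a purely numerical illustration of the special case wD = wD̄ = 1 (the Su-and-Liu AUC-optimal direction, which the iterative pAUC solution would refine), the combination and a crude empirical partial AUC can be computed in R from simulated markers as follows.

set.seed(1)
n <- 300
xD    <- matrix(rnorm(2 * n, mean = 0.8), ncol = 2)      # disease-group markers
xDbar <- matrix(rnorm(2 * n, mean = 0.0), ncol = 2)      # non-disease-group markers
l <- solve(cov(xD) + cov(xDbar), colMeans(xD) - colMeans(xDbar))   # combination vector
sD <- drop(xD %*% l); sDbar <- drop(xDbar %*% l)                   # combined scores
emp_pauc <- function(sD, sDbar, fpr_max = 0.2) {
  thr <- quantile(sDbar, probs = 1 - seq(0, fpr_max, by = 0.001))  # thresholds over FPR in [0, fpr_max]
  mean(outer(sD, thr, ">")) * fpr_max                              # average-TPR approximation to the pAUC
}
emp_pauc(sD, sDbar)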


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1

1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore an asymmetric decision rule, as proposed by Bauer & Köhne (1994) for adaptive designs, is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994). For example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
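
The decision rule itself is easy to transcribe in R; in the sketch below the constants α0, α1 and cα are arguments that must be calibrated together to the desired level (as in the paper), and the cα used in the example call is an uncalibrated placeholder.

modified_combination_test <- function(p1, p2, alpha1, alpha0, c_alpha) {
  (max(p1, p2) <= alpha1) ||
    (max(p1, p2) <= alpha0 && p1 * p2 <= c_alpha)   # reject if either condition holds
}

# alpha0 = 0.5 and alpha1 = 0.1793 as quoted above for alpha = 0.05;
# c_alpha below is illustrative only, not the calibrated constant.
modified_combination_test(0.03, 0.21, alpha1 = 0.1793, alpha0 = 0.5, c_alpha = 0.0087)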


11:40 - 12:20

THURSDAY 3RD DEC, Session 2 (Boardroom): Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2

1CSIRO Mathematics, Informatics and Statistics; 2CSIRO Food Futures National Research Flagship; 3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represent phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, due to multiple founders and intermediate generations being unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1 Benoit Auvray1 Peter Amer2 Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch Invermay Agricultural Centre, Mosgiel, New Zealand; 2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will almost be true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets and appropriate phenotypes for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler1

1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics, because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p values are often interpreted in a classical analysis as giving one minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews 2001). In this poster Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1. Matthews (2001, J. Stat. Plan. Inf. 94, 43-58)


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA

PROGESTERONE

Margaret Carr1 Tony Swain2 Olena Kravchuk1 and Geoffry Fordyce2

1School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia; 2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and Androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<5.7%) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms and unequal variances for the repeated measures, with the heifer × day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al., Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The interaction vaccine × weaner size × time was only significant in 1992. The interaction vaccine × time was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccinations, but the vaccine effect diminished as the heifers aged. The interaction nutrition × weaner size × time was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study with a complicated unbalanced repeated measures design.


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1

1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable: omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the tree, which is then used to predict the other half of the data. By examining the predictive ability of several thousands of trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
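
A compact gbm sketch of the workflow described above is given below, on simulated counts with made-up predictor names; the settings are illustrative rather than those used for the cicada analysis.

library(gbm)
set.seed(1)
d <- data.frame(shelter = runif(200), ground_cover = runif(200), altitude = runif(200))
d$count <- rpois(200, lambda = exp(1 + 1.5 * d$shelter - d$altitude))
fit <- gbm(count ~ ., data = d, distribution = "poisson",
           n.trees = 2000, interaction.depth = 3, shrinkage = 0.01,
           bag.fraction = 0.5, train.fraction = 0.5)   # half the data used to build the trees
best <- gbm.perf(fit, method = "test")   # number of trees minimising held-out deviance
summary(fit, n.trees = best)             # relative influence of each predictor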


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER

MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land, Crop and Food Sciences, University of Queensland; 2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling in an experiment investigating the digestibility of the sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions changed in an obvious but complex way with changes in the milling energy. The average volumetric diameter alone was not an adequate summary of the distributions. It was thus required to construct a tailored algorithm for summarizing the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION

INDICES

David Lazaridis1 Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne; 2Remote sensing team, CSIRO Sustainable Ecosystems; 3Senior Lecturer, The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
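
For readers unfamiliar with these fits, a small glmnet sketch of ridge and lasso with cross-validated penalty selection on simulated collinear predictors is shown below; the MODIS-derived data themselves are not reproduced.

library(glmnet)
set.seed(1)
n <- 100; p <- 50
z <- rnorm(n)
x <- matrix(rnorm(n * p), n, p) + z             # shared component makes the columns collinear
y <- 2 * x[, 1] - x[, 2] + rnorm(n)
cv_ridge <- cv.glmnet(x, y, alpha = 0)          # alpha = 0: ridge regression
cv_lasso <- cv.glmnet(x, y, alpha = 1)          # alpha = 1: the LASSO
c(ridge_cvmse = min(cv_ridge$cvm), lasso_cvmse = min(cv_lasso$cvm))
coef(cv_lasso, s = "lambda.min")[1:5, ]         # first few lasso coefficients at the selected penalty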


CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia; 2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of the analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens, if not hundreds, of thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and explore the extent to which this might be a problem in applications. In particular we compare the analysis of log-transformed data to full compositional data analysis.
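
One common compositional treatment of count data, contrasted with a naive log transform, is the centred log-ratio; a minimal R version (with a pseudo-count to guard against zeros) is sketched below on simulated counts.

set.seed(1)
counts <- matrix(rpois(5 * 8, lambda = 50), nrow = 5)   # 5 samples x 8 'features'
clr <- function(x) {
  lx <- log(x + 0.5)                                    # pseudo-count avoids log(0)
  sweep(lx, 1, rowMeans(lx))                            # centre each sample (row) on its mean log
}
round(clr(counts), 2)                                   # rows of the clr-transformed data sum to ~0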


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. By using a Probability-Probability plot as a visual tool for validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative, we have employed a minimum-squares-type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data for all but one of the 83 species.


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND

METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007 A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1

1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Diabetes complications such as kidney disease cause patients considerable pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of serum creatinine level changes over time, the lack of longitudinal data and information on this trend in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that utilized the files of patients with type 2 diabetes who attended the Isfahan Metabolic and Endocrine Research Center from 1997 to 2007. This information was collected longitudinally. We used linear mixed-effects models to analyse the data. Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information that this study provides can be used to identify groups at high risk of renal dysfunction.

Key Words: longitudinal study, mixed-effect models, creatinine, type 2 diabetes


THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENT UNDERGONE ISOLATED

CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics; 2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Most studies on postoperative complications for isolated coronary artery bypass graft (CABG) surgeries are from one population and may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgeries for an Australian population, because there is no model developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation set (60%) and a model validation set (40%). The data in the creation set were used to develop the model and the validation set was then used to validate the model. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC and the Hosmer-Lemeshow p-value respectively. Results: Among the 14533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3). The two postoperative complications occurred in 3.65% (new renal failure) and 1.38% (stroke) of patients. The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L < 0.001). Conclusion: We have identified risk factors for two major postoperative complications for CABG surgery.


SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2

1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics, Informatics and Statistics; 2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis in which physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data. Instead, they are usually point predictions from spatial models fitted to auxiliary data sources. It is not clear what effect these modelled covariates have on the biodiversity model, although simple approximations for simple models do give indications. We have performed simulation studies to investigate the nature of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.
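A toy simulation, separate from the authors' study, illustrating the kind of effect at issue: when a regression uses a noisy spatial prediction in place of the true physical covariate, the estimated effect is attenuated. All quantities below are invented:

set.seed(42)
n      <- 500
x_true <- rnorm(n)                           # true physical covariate
x_pred <- x_true + rnorm(n, sd = 0.7)        # point prediction from a spatial model
y      <- rpois(n, exp(0.2 + 0.5 * x_true))  # simulated species count

coef(glm(y ~ x_true, family = poisson))      # recovers a slope near 0.5
coef(glm(y ~ x_pred, family = poisson))      # slope attenuated towards zero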

99The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno; 2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of the article is to identify the dependency structure of gene variants which influence septic states in paediatric patients.

The data set contains data on 580 paediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1,2,3]. To identify the role of different combinations of gene variants and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. In this way it was possible to create a 5-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patients group. The typical combinations of gene variants for the healthy group and for the septic patients group were then found. The results correspond well with those published in [1,2,3] for individual genes and make it possible to recognise the typical combinations of variants of the six genes on which attention should be focused.

References: [1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol. 33, pp. 2158-2164. [2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. Cesko-Slovenska Pediatrie 59, pp. 632-636, 2004. [3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukin-6 gene variants and the risk of sepsis development in children. Human Immunology, Elsevier Science Inc., ISSN 0198-8859, 2007, vol. 68, pp. 756-760.
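One way to sketch the kind of hierarchical log-linear analysis described above is with MASS::loglm in R; the data frame septic_group, the table construction and the variant codings are hypothetical:

library(MASS)

tab <- xtabs(~ TLR299 + BPI_Taq + LBP429 + IL6_176 + HSP70, data = septic_group)

m_sat  <- loglm(~ TLR299 * BPI_Taq * LBP429 * IL6_176 * HSP70, data = tab)   # saturated model
m_2way <- loglm(~ (TLR299 + BPI_Taq + LBP429 + IL6_176 + HSP70)^2, data = tab)
anova(m_2way, m_sat)   # does the all-two-way-association model fit adequately?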

100 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD SCALE MARK-CAPTURE

TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd; 2 20 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application, with a tractor-mounted boom sprayer, of Bacillus thuringiensis Berliner (Bt) was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion p of the moths remained was also estimated, viz for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
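A minimal sketch in R of the two steps just described: fitting the Poisson log-link dispersal curve, then solving the displayed equation for c. The trap-catch data frame traps, the single-covariate model and the search interval are hypothetical simplifications:

fit <- glm(catch ~ distance, family = poisson(link = "log"), data = traps)
b   <- -coef(fit)[["distance"]]    # decay rate of the fitted dispersal curve

p   <- 0.90                        # proportion of moths assumed to remain within c
c_p <- uniroot(function(cc) exp(-b * cc) * (1 + b * cc) - (1 - p),
               interval = c(1, 5000))$root
c_p                                # distance within which ~90% of moths remain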

101The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT

BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury; 2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al. 2008) we have shown that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study all oxygen saturation measurements across different behavioural states were combined together. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than clustering based on the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M, Lee DS, Russell G, et al. (2008). Australian Statistical Conference 2008.
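A minimal sketch, with hypothetical data, of the clustering step described above: babies are grouped by hierarchical clustering of distances between the empirical distributions of a per-segment oxygen-saturation measure (here the coefficient of variation, using the two-sample Kolmogorov-Smirnov statistic as the distance):

cv_by_baby <- split(segments$cv, segments$baby_id)   # per-baby CV values (hypothetical)
n <- length(cv_by_baby)

d <- matrix(0, n, n)
for (i in seq_len(n))
  for (j in seq_len(n))
    d[i, j] <- as.numeric(ks.test(cv_by_baby[[i]], cv_by_baby[[j]])$statistic)

groups <- cutree(hclust(as.dist(d), method = "average"), k = 2)   # stable vs unstable
table(groups)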

102 The International Biometric Society Australasian Region Conference

Index of Presenting Authors

Arnold R 33
Asghari M 26
Baird D 30
Balinas V 46
Barnes M 27
Basford KE 84
Beath K 51
Bland JM 49
Briggs J 76
Burridge C 28
Burzykowski T 73
Butler R (poster) 89
Campain A 56
Chang K 70
Chang Y 85
Chee C 77
Clark ASS 36
Clarke S 74
Clifford D 72
Connolly P (poster) 91
Cui J 55
D'Antuono M 67
Darnell R (1) 35
Darnell R (2) 47
Davy M 40
Day S 43
Ding P 69
Dobbie M 48
Dodds K 88
Fewster R 37
Forrester R 34
Ganesalingam S 79
Ganesh S 78
Gatpatan JMC 48
Graham M 33
Graham P 65
Huang E 87
Hwang J 57
Ihaka R 36
Jones G 45
Kifley A 53
Kipnis V 61
Koolaard J (poster) 92
Kravchuk O (poster 1) 90
Kravchuk O (poster 2) 92
Lazaridis D 93

Le Cao K 81
Littlejohn R 67
Liu I 66
Liu J 82
Lumley T 59
Marschner I 52
Matthews D 58
McLachlan A 44
Meyer D 68
Meyer R 75
Mohebbi M 38
Mueller S 47
Muller W (poster) 94
Naka M (poster) 95
Neeman T 82
Neuhäuser M 86
Orellana L 54
Park E 64
Park Z 42
Pirie M 32
Poppe K 71
Rousta S (poster) 96
Ruggiero K 71
Ryan L 25
Sanagou M (poster) 97
Saville D 83
Scott A 60
Shimadzu H 62
Shimadzu H (poster) 98
Sibanda N 76
Smerek M (poster) 99
Smith AB 69
Stewart M 77
Stojanovski E 31
Taylor J 41
Thijs H 50
Triggs CM 80
Wallace AR (poster) 100
Wang Y 29
Welsh A 51
Williams E 70
Yee T 62
Yelland L 63
Yoon H 39
Zahari M (poster) 101

107The International Biometric Society Australasian Region Conference

DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau

108 The International Biometric Society Australasian Region Conference

Name E-mail

Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde

Delegates List

109The International Biometric Society Australasian Region Conference

Delegates List

Name Email

Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz

CONFERENCE AT A GLANCE

Welcome Reception - Sunday 29 Nov

A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.

Keynote addresses

Louise Ryan (Monday 30 Nov) Martin Bland (Tuesday 1 Dec) Thomas Lumley (Wednesday 2 Dec) Chris Triggs (Thursday 3 Dec)

All keynote addresses begin at 9am and will be held in Swifts (see map of venue on page 16)

Invited Speakers

Ross Ihaka (1330 Monday 30 Nov) Alison Smith (1330 Wednesday 2 Dec)

Kaye Basford (1100 Thursday 3 Dec)

All invited speaker talks will be held in Swifts (see map of venue on page 16)

Organised Social Activities - Tuesday 1 Dec

This is a long-standing part of the conference program so keeping with tradition we have arranged four options for the afternoon of Tuesday 1 December after lunch We hope that you will find at least one of these activities attractive because we want you to relax get a breath of fresh air (or sulphur fumes at Orakei Korako) have fun and see some of what this part of New Zealand has to offer especially for first-time visitors These activities are optional and tickets need to be purchased for them through the conference organisers Preferences will be considered on a first come first served basis If you have queries about any of the social activities please contact Hans Hockey in the first instance An afternoon snack will be provided on the cruise and kayaking while the bus tour visits a cafe and the jet boating trip is too action-packed for eating

If you have not already registered for one of these activities please talk to someone at the registration desk to make arrangements before Tuesday morning Please note that all costs for the activities are in New Zealand Dollars

3The International Biometric Society Australasian Region Conference

Conference At A Glance

Conference dinner - Wednesday 2 Dec

To add some novelty to the conference program, the conference dinner will be held at the Prawn Park restaurant, home of Shawn the Prawn. At the Prawn Park, just 10 minutes' drive north of Taupo on the Waikato River (see map on page 8), you will be able to compete at prawn fishing or the Killer Prawn Hole-in-One Golf Challenge, take a guided tour of the nursery and hatchery, enjoy fun and interactive water features and a glass of bubbly in the geothermal footbath, as well as a sumptuous meal with breathtaking views at the riverside restaurant (on the patio, weather permitting). Drinks (wine and non-alcoholic) will be provided and all dietary requirements can be catered for.

Coaches have been arranged to transfer delegates to Huka Prawn Farm from the Suncourt Hotel leaving 6 pm with return trips at the conclusion of the event

Conference Prizes - Donated by CSIRO Mathematical and

Information Sciences

Prizes will be awarded for the best oral presentation and the best poster presentation by a young statistician, as judged by a panel. To be eligible for these awards the presenter must be a member of the IBS Australasian Region and be either a student (full-time or part-time), or a person who has graduated with a Bachelor's Degree (in a biometrical-related field) within the last five years, or a person awarded a Postgraduate Degree within the past year.

4 The International Biometric Society Australasian Region Conference

KEYNOTE SPEAKERS

Martin Bland University of York

Martin Bland joined the University of York as Professor of Health Statistics in 2003. Before this he spent 27 years at St George's Hospital Medical School, University of London, following posts at St Thomas's Hospital Medical School and in industry with ICI. He has a BSc in mathematics, an MSc in statistics and a PhD in epidemiology. He is the author or co-author of An Introduction to Medical Statistics, now in its third edition, and Statistical Questions in Evidence-based Medicine, both Oxford University Press; of 190+ refereed journal articles reporting public health and clinical research and research methods; and, with Prof Doug Altman, of the Statistics Notes series in the British Medical Journal. He is currently working on clinical trials in wound care, hazardous alcohol use, depression, irritable bowel syndrome and stroke prevention. His personal research interests are in the design and analysis of studies of clinical measurement and of cluster randomised clinical trials. His 1986 Lancet paper with Doug Altman on statistical methods for assessing agreement between two methods of clinical measurement has now been cited more than 13,000 times; it is the most cited paper ever to appear in the Lancet and has been reported to be the sixth most highly cited statistical paper ever.

Martin presented a two-day satellite course in Auckland on 25-26 November on Cluster Randomised Trials

Thomas Lumley University of Washington

Thomas Lumley is an Associate Professor in the Biostatistics Department at the University of Washington in Seattle Thomas has accrued an impressive body of work and awards in a comparatively short amount of time Since completing his PhD in 1998 Thomas has published well over 100 peer reviewed articles in the leading journals of statistics biostatistics and the health sciences on theory methodology and application In addition he has given a substantial number of talks and workshops around the world In 2008 Thomas was awarded the Gertrude Cox Award for contributions to Statistical Practice Thomas is also a member of the R Core development team and his expertise in the field of statistical computing is recognised worldwide

5The International Biometric Society Australasian Region Conference

Keynote Speakers

Louise Ryan CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics Informatics and Statistics (CMIS) Dr Ryan has a distinguished career in biostatistics having authored or co-authored over 200 papers in peer-reviewed journals Louise is a fellow of the American Statistical Association and the International Statistics Institute and is an elected member of the Institute of Medicine She has served in a variety of professional capacities including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society She has served on advisory boards for several government agencies in the USA including the National Toxicology Program and the Environmental Protection Agency as well as several committees for the National Academy of Science She retains an adjunct professorship at Harvard

Chris Triggs University of Auckland

Chris Triggs is a Professor as well as being the current department head of Statistics at the University of Auckland New Zealand He has been a respected statistician for 30 years specializing in fields as diverse as experimental design and forensic science Professor Triggs has published more than 90 papers in a wide variety of statistical fields His research interests include experimental design population genetics and the application of statistical methods in many fields of science including forensic science and nutrigenomics He has lectured extensively in many of these subjects in Australasia Professor Triggs is an Associate Editor for Biometrics and is often called upon as referee for many other journals

6 The International Biometric Society Australasian Region Conference

INVITED SPEAKERS

Ross Ihaka University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland. He is recognized as one of the originators of the R programming language. In 2008 he received the Royal Society of New Zealand's Pickering Medal for his work on R.

Kaye Basford University of Queensland

Kaye Basford is Head of the School of Land Crop and Food Sciences at the University of Queensland Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments in particular using a pattern analysis approach Kaye is currently IBS Vice-President in advance of her Presidential term 2010-11

Alison Smith NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit where she works on and researches methodology for plant breeding multi-environment variety trials plant quality trait experiments micro-array data and outlier detection in linear mixed models

7The International Biometric Society Australasian Region Conference

GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events.

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions.

Conference Catering
Lunches and Morning and Afternoon Teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16).

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk.

Transport has been arranged in coaches to transfer delegates to dinner from the Suncourt Hotel & Conference Centre, leaving at 6 pm, with return trips at the conclusion of the event.

Welcome Reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.

8 The International Biometric Society Australasian Region Conference

VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel
Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (0.2 km)
Turn left at Titiraupenga Street (31 m)
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street then onto Tongariro Street)
Continue onto Tongariro Street (1.1 km - go through one roundabout)
Continue onto SH 1 / SH 5 (1.0 km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past Honey Hive to the end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo, (07) 378 8265, wwwsuncourtconz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo

9The International Biometric Society Australasian Region Conference

ORGANISED SOCIAL ACTIVITIES
Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region.

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member non-member or student) attending the whole week Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6 pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim, then a meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo. Tel (07) 378-7080).

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities- Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors smell the coffee brewing as you board the Waikare II take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park The sights are amazing all year round Afternoon tea is included as part of your charter and tea or coffee are complimentary throughout the cruise There are also full bar facilities

Fishing for and hopefully eating rainbow or brown trout is included in the charter although to meet licence requirements only four clients can be nominated to actually land the catch Only 4 lines can be put out at a time on downriggers If successful any catch can be barbequed or sashimied and served and shared onboard - there

10 The International Biometric Society Australasian Region Conference

Organised Social Activities

is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation, where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting. The cost is $180 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front
Take: Swimwear, including a towel, if you want an invigorating deep-water swim off the launch. Don't forget to take your camera as some of the scenery can only be seen from on the water
Cost: $70 per person, based on a three-hour scenic charter including fishing, with clay bird shooting extra at $180 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.

11The International Biometric Society Australasian Region Conference

Organised Social Activities

Leaving the gushing sounds of the mesmerizing Falls, you cut through leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick-up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (a waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating, geothermal and nature - Orakei Korako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Korako, possibly the best thermal area in New Zealand.

12 The International Biometric Society Australasian Region Conference

Organised Social Activities

In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's cave and some of the largest silica terraces in the world.

While the park visitors are on land option 2 ($140) whisks swimmers away to the Squeeze You will disembark the boat in knee deep warm water After manoeuvring your way through narrow crevasses climbing boulders and wading through waist-deep warm water you emerge in stunning native New Zealand bush Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool

Then the groups rejoin for the thrilling return trip giving a total trip time of about three hours This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience

Time: Transport departs Suncourt Hotel at 1.30 pm, returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1 including park admission; $140 pp for option 2; both options including transport
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used

13The International Biometric Society Australasian Region Conference

Organised Social Activities

4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop-off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly, terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: Cafe snack is not included but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22

14 The International Biometric Society Australasian Region Conference

SPONSORS
The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland

15The International Biometric Society Australasian Region Conference

Sponsors

AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax

16 The International Biometric Society Australasian Region Conference

VENUE FLOOR PLAN

1 Boardroom - For all Board session presentations
2 Swifts - For keynote addresses, invited speaker talks and all Swifts sessions
3 Bathrooms/Toilets
4 'Chill on Northcroft' Restaurant - All morning/afternoon teas and lunches will be provided here
5 Gullivers - Computer room with two internet access desktops
6 Lems - Registration desk location and further desk space and power points for wireless internet access

17The International Biometric Society Australasian Region Conference

CONFERENCE TIMETABLE

SUNDAY 29TH NOV1600 Conference Registration opens1800 Welcome Reception

Dinner (own arrangement)

MONDAY 30TH NOV850 Presidential Opening (Swifts)

Graham Hepworth University of Melbourne900 Keynote Address (Swifts)

Louise Ryan CSIRO Mathematics Informatics and StatisticsQuantifying uncertainty in risk assessmentChair Graham Hepworth

950

-1030

Session 1 Swifts Medical

Chair John Field

Session 1 Boardroom Ecological ModellingChair Teresa Neeman

950 Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approachMohamad Asghari Tarbiat Modares University

Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splinesCharis Burridge CSIRO Mathematics Informatics and Statistics

1010 Personalised medicine endovascular aneurysm repair risk assessment model using preoperative variablesMary Barnes CSIRO Mathematics Informatics and Statistics

Rank regression for analyzing environmental dataYou-Gan Wang CSIRO Mathematics Informatics and Statistics

1030 Morning Tea (30 minutes)1100

-1220

Session 2 Swifts Modelling

Chair Andrew McLachlan

Session 2 Boardroom Environmental amp Methods

Chair Zaneta Park1100 Introduction to Quantile

regressionDavid Baird VSN NZ Ltd

Capture recapture estimation using finite mixtures of arbitrary dimension Richard Arnold Victoria University

18 The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV1120 Incorporating study

characteristics in the modelling of associations across studiesElizabeth Stojanovski University of Newcastle

The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabiesRobert Forrester ANU

1140 A comparison of matrices of time series with application in dendroclimatologyMaryanne Pirie University of Auckland

Model based grouping of species across environmental gradientsRoss Darnell CSIRO Mathematics Informatics and Statistics

1200 How SAS and R integrateMichael Graham SAS Auckland

The use of the chi-square test when observations are dependentAustina Clark University of Otago

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts) Ross Ihaka University of AucklandWriting Efficient Programs in R and BeyondChair Renate Meyer

1410

-1510

Session 3 Swifts Variance

Chair Geoff Jones

Session 3 Boardroom Genetics

Chair John Koolaard

1410 Variance estimation for systematic designs in spatial surveysRachel Fewster University of Auckland

Developing modules in genepattern for gene expression analysisMarcus Davy Plant and Food Research

1430 Variance components analysis for balanced and unbalanced data in reliability of gait measurementMohammadreza Mohebbi Monash University

High dimensional QTL analysis within complex linear mixed modelsJulian Taylor CSIRO Mathematics Informatics and Statistics

1450 Modernizing AMOVA using ANOVAHwan-Jin Yoon ANU

Correlation of transcriptomic and phenotypic data in dairy cowsZaneta Park AgResearch

1510 Afternoon Tea (30 minutes)

19The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV1540

-1700

Session 4 Swifts Modelling

Chair Mario D'Antuono

Session 4 Boardroom Ecology

Chair Rachel Fewster1540 Non-inferiority margins in

clinical trialsSimon Day Roche Products Ltd

Visualising model selection criteria for presence and absence data in ecology Samuel Mueller University of Sydney

1600 Data processing using Excel with RAndrew McLachlan Plant and Food Research Lincoln

Estimating weights for constructing composite environmental indicesRoss Darnell CSIRO Mathematics Informatics and Statistics

1620 Investigating covariate effects on BDD infection with longitudinal data Geoffrey Jones Massey University

A spatial design for monitoring the health of a large-scale freshwater river systemMelissa Dobbie CSIRO Mathematics Informatics and Statistics

1640 Statistical modelling of intrauterine growth for FilipinosVincente Balinas University of the Philippines Visayas

Backfitting estimation of a response surface modelJhoanne Marsh C Gatpatan University of the Philippines Visayas

1700 Poster SessionChair Melissa Dobbie

1800 Dinner (own arrangement)

20 The International Biometric Society Australasian Region Conference

Conference Timetable

TUESDAY 1ST DEC900 Keynote Address (Swifts)

Martin Bland University of YorkClustering by treatment provider in randomised trialsChair Simon Day

950

-1030

Session 1 Swifts Missing Data

Chair Vanessa Cave

Session 1 Boardroom Count Data

Chair Hwan-Jin Yoon950 The future of missing data

Herbet Thijs Hasselt University

A strategy for modelling count data which may have extra zerosAlan Welsh ANU

1010 Application of latent class with random effects models to longitudinal dataKen Beath Macquarie University

A reliable constrained method for identity link Poisson regressionIan Marschner Macquarie University

1030 Morning TeaIBS Biennial General Meeting (60 minutes)

1130

-1230

Session 2 Swifts Medical

Chair Hans Hockey

Session 2 Boardroom Modelling

Chair Olena Kravchuk1130 Multivariate response

models for global health-related quality of lifeAnnette Kifley Macquarie University

Building a more stable predictive logistic regression modelAnna Campain University of Sydney

1150 Estimation of optimal dynamic treatment regimes from longitudinal observational dataLiliana Orellana Universidad de Buenos Aires

Stepwise paring down variation for identifying influential multifactor interactionsJing-Shiang Hwang Academia Sinica

1210 Parametric conditional frailty models for recurrent cardiovascular events in the lipid studyJisheng Cui Deakin University

Empirical likelihood estimation of a diagnostic test likelihood ratioDavid Matthews University of Waterloo

1230 Lunch (1 hour)1330 Organised Social Activities

1800 Dinner (own arrangement)

21The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC900 Keynote Address (Swifts)

Thomas Lumley University of WashingtonUsing the whole cohort in analysis of subsampled data Chair Alan Welsh

950

-1030

Session 1 Swifts Clinical Trials

Chair Ian Marschner

Session 1 Boardroom Fisheries

Chair Charis Burridge950 Adjusting for nonresponse in

case-control studiesAlastair Scott University of Auckland

An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimationHideyasu Shimadzu GeoScience Australia

1010 Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associationsVictor Kipnis USA National Cancer Institute

On the 2008 World Fly Fishing ChampionshipsThomas Yee University of Auckland

1030 Morning Tea (30 minutes)

1100

-1220

Session 2 Swifts Medical Models

Chair Katrina Poppe

Session 2 Boardroom AgricultureHorticulture

Chair Emlyn Williams

1100 Relative risk estimation in randomised controlled trials a comparison of methods for independent observationsLisa Yelland University of Adelaide

Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactationRoger Littlejohn AgResearch

1120 Multiple stage procedures in covariate-adjusted response-adaptive designsEunsik Park Chonnam National University

Some statistical approaches in estimating lambing rates
Mario D'Antuono, Dept of Agriculture WA

22 The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC1140 Potential outcomes and

propensity score methods for hospital performance comparisonsPatrick Graham University of Otago

FTIR analysis associations with induction and release of kiwifruit buds from dormancyDenny Meyer Swinburne University of Technology

1200 Local odds ratio estimation for multiple response contingency tablesIvy Liu Victoria University

Non-linear mixed-effects modelling for a soil temperature studyPauline Ding ANU

1220 Lunch (1 hour 10 minutes)1330 Invited Speaker (Swifts)

Alison Smith NSW Department of Industry and InvestmentEmbedded partially replicated designs for grain quality testingChair David Baird

1410

-1510

Session 3 Swifts Design

Chair Ross Darnell

Session 3 Boardroom Functional AnalysisChair Marcus Davy

1410 Spatial models for plant breeding trialsEmlyn Williams ANU

Can functional data analysis be used to develop a new measure of global cardiac functionKatrina Poppe University of Auckland

1430 A two-phase design for a high-throughput proteomics experimentKevin Chang University of Auckland

Variable penalty dynamic warping for aligning GC-MS dataDavid Clifford CSIRO

1450 Shrinking sea-urchins in a high CO2 world a two-phase experimental designKathy Ruggiero University of Auckland

A model for the enzymatically 18O-labeled MALDI-TOF mass spectraTomasz Burzykowski Hasslet University

1510 Afternoon Tea (30 minutes)

23The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC1540

-1700

Session 4 Swifts Methods

Chair David Clifford

Session 4 Boardroom Mixtures amp Classification

Chair Thomas Yee1540 High-dimensional multiple

hypothesis testing with dependenceSandy Clarke University of Melbourne

On estimation of nonsingular normal mixture densitiesMichael Stewart University of Sydney

1600 Metropolis-Hastings algorithms with adaptive proposalsRenate Meyer University of Auckland

Estimation of finite mixtures with nonparametric componentsChew-Seng Chee University of Auckland

1620 Bayesian inference for multinomial probabilities with non-unique cell classification and sparse dataNokuthaba Sibanda Victoria University

Classification techniques for class imbalance dataSiva Ganesh Massey University

1640 Filtering in high dimension dynamic systems using copulasJonathon Briggs University of Auckland

Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the meanSelvanayagam Ganesalingam Massey University

1800 Conference Dinner

24 The International Biometric Society Australasian Region Conference

Conference Timetable

THURSDAY 3RD DEC900 Keynote Address (Swifts)

Chris Triggs University of AucklandNutrigenomics - a source of new statistical challengesChair Ruth Butler

950

-1030

Session 1 Swifts Genetics

Chair Ken Dodds

Session 1 Boardroom Ecology

Chair Duncan Hedderley950 Combination of clinical and

genetic markers to improve cancer prognosisKim-Anh Le Cao University of Queensland

A multivariate feast among bandicoots at Heirisson ProngTeresa Neeman ANU

1010 Effective population size estimation using linkage disequilibrium and diffusion approximationJing Liu University of Auckland

Environmental impact assessments a statistical encounterDave Saville Saville Statistical Consulting Ltd

1030 Morning Tea (30 minutes)1100 Invited Speaker (Swifts)

Kaye Basford University of QueenslandOrdination of marker-trait association profiles from long-term international wheat trialsChair Lyn Hunt

1140

-1220

Session 2 Swifts Medical

Chair Ken Beath

Session 2 Boardroom Genetics

Chair Julian Taylor1140 Finding best linear

combination of markers for a medical diagnostic with restricted false positive rateYuan-chin Chang Academia Sinica

Believing in magic validation of a novel experimental breeding designEmma Huang CSIRO Mathematics Informatics and Statistics

A modified combination test for the analysis of clinical trials
Markus Neuhäuser, Rhein Ahr Campus

Phenotypes for training and validation of whole genome selection methodsKen Dodds AgResearch

1220 Closing Remarks1230 Lunch1300 Conference Concludes

25The International Biometric Society Australasian Region Conference

ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts) Louise Ryan CSIRO Mathematics Informatics and StatisticsChair Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise RyanCSIRO Mathematics Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. in the insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.

26 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

950 - 1030

MONDAY 30TH NOVSession 1 Swifts Medical Chair John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING

RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal cancer (CRC) is one of the most malignant cancers throughout the world, and it varies because of the differing effects of risk factors in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluating the risk factors of the cancer as a whole would not provide a thorough understanding of the disease. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis, according to the pathology reports of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007, were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis in Stata statistical software. The results confirm gender, alcohol history, IBD and tumour grade as specific risk factors of colon cancer, and hypertension, opium and personal history as specific risk factors of rectum cancer. BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity, and colon and rectum cancers should be evaluated specifically to reveal hidden associations which may not be revealed under general modelling. These findings could provide more information for prognosis and treatment therapy and the possible application of screening programs specifically for colon and rectum carcinomas.
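The abstract reports a competing risks analysis carried out in Stata; an analogous analysis could be sketched in R with the cmprsk package as below. The data frame crc, the event coding and the covariates are hypothetical placeholders, not the authors' specification:

library(cmprsk)

covs <- model.matrix(~ sex + alcohol + ibd + tumour_grade + bmi + stage,
                     data = crc)[, -1]
fit  <- crr(ftime = crc$time, fstatus = crc$event,    # 0 = censored, 1 = colon, 2 = rectum
            cov1 = covs, failcode = 1, cencode = 0)   # subdistribution model for colon cancer
summary(fit)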

27The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

PERSONALISED MEDICINE ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING

PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia, Mathematics, Informatics and Statistics, Glen Osmond, South Australia; 2Department of Surgery, University of Adelaide, The Queen Elizabeth Hospital, Adelaide, South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18-month period between 1999 and 2001 and whose outcomes were followed for more than five years.

The ERA Model is available at the following website (wwwhealthadelaideeduausurgeryevar). The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA Model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using data from a specialist UK vascular institute. Despite UK patients being sicker (p < 0.001), having larger aneurysms (p < 0.001) and being more likely to die (p < 0.05) than the Australian patients, the ERA model fitted the UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher areas under ROC curves and/or higher R2.

The ERA Model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008). Eur J Vasc Endovasc Surg 35: 571-579.
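One standard way to carry out the internal validation by bootstrapping mentioned above is an optimism-corrected estimate of the area under the ROC curve; the sketch below is illustrative only, with a hypothetical data frame era and hypothetical outcome and covariate names:

library(pROC)

fit_full <- glm(early_death ~ age + asa_rating + gender + aneurysm_diameter +
                  creatinine + neck_angle + neck_length + neck_diameter,
                family = binomial, data = era)
auc_app <- auc(roc(era$early_death, fitted(fit_full)))   # apparent AUC

optimism <- replicate(200, {
  boot  <- era[sample(nrow(era), replace = TRUE), ]
  fit_b <- glm(formula(fit_full), family = binomial, data = boot)
  auc(roc(boot$early_death, fitted(fit_b))) -
    auc(roc(era$early_death, predict(fit_b, era, type = "response")))
})
auc_app - mean(optimism)   # optimism-corrected AUC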

28 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

950 - 1030

MONDAY 30TH NOV Session 1 Boardroom Ecological ModellingChair Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED

REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics, Informatics and Statistics; 2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that annual multi-species fishery-independent surveys be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region as well as the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
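The authors fitted their splines in BayesX; a broadly analogous penalised-spline spatial smooth can be sketched in R with the mgcv package. The data frame trawls, the prediction grid and the variable names are assumptions.

library(mgcv)
# spatial penalised regression spline for trawl counts, with an offset for area swept
fit <- gam(count ~ s(lon, lat, k = 60) + offset(log(area_swept)),
           family = poisson, data = trawls, method = "REML")
summary(fit)
# predicted density over a grid of locations (grid must contain lon, lat and area_swept)
grid$pred <- predict(fit, newdata = grid, type = "response")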


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics Informatics and Statistics, Australia
2School of Mathematics and Statistics, Northeast Normal University, China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust; it has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions. To alleviate the computational burden we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.
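A basic rank-based regression fit (without the induced-smoothing machinery described above) can be obtained in R with the Rfit package; the data frame water and its variables are assumptions for illustration.

library(Rfit)                      # rank-based (R) estimation for linear models
fit <- rfit(log_conc ~ flow + season, data = water)
summary(fit)                       # rank-based estimates and standard errors
coef(fit)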


1100 - 1220

MONDAY 30TH NOV Session 2 Swifts: Modelling. Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird1
1VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize Σ e(Q − I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
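In R such models are fitted by the quantreg package; a minimal sketch, with a hypothetical data frame and variables:

library(quantreg)
# median regression plus curves for the 10th and 90th percentiles
fit <- rq(yield ~ rainfall, tau = c(0.1, 0.5, 0.9), data = crops)
summary(fit)                                              # coefficients for each quantile
# bootstrap standard errors for the median fit
summary(rq(yield ~ rainfall, tau = 0.5, data = crops), se = "boot")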


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS, University of Newcastle
2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially, the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed to be distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie1
1Department of Statistics and School of Geography, Geology and Environment,

University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are a result of a possible failure of the uniformitarianism principle. This is because the response of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and hence this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring width indices for each core was divided into two subsets:

1. The portion of the series produced when the trees were small, and

2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham1
1Analytics - SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers some users a way to experiment with new cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together, and plans for future integration.

1100 - 1220

MONDAY 30TH NOV Session 2 Boardroom: Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ
2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit, ANU
2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals, but as yet has not been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaConTM (Vac1), or a single vaccination of GonaConTM followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions at irregular intervals over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or boosted vaccination.

The data are analysed using repeated measures methods to assess the long-term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.
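One option for irregularly spaced repeated measures of this kind is a linear mixed model with a continuous-time autocorrelation structure; a hedged R sketch using nlme, with assumed variable names (wallaby is a long-format data frame with one row per animal and occasion):

library(nlme)
# random intercept per animal plus continuous-time AR(1) correlation to handle unequal spacing
fit <- lme(weight ~ treatment * weeks, random = ~ 1 | animal,
           correlation = corCAR1(form = ~ weeks | animal), data = wallaby)
summary(fit)
anova(fit)    # overall tests for treatment, time and their interaction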


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics Informatics and Statistics
2CSIRO Wealth from Ocean Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark1
1University of Otago

E-mail aclarkmathsotagoacnz

When the Chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the Chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S.W. (1958, JEBS, 69-82) and Huynh, H. & Feldt, L.S. (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.

1330 MONDAY 30TH NOV Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ
2University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.


1410 - 1510

MONDAY 30TH NOV Session 3 Swifts: Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1
1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modeling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT

MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia

2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods was illustrated in examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions, and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlEx.

Using fungus microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages in which ANOVA and REML are standard methods may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
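As a rough illustration of estimating the same hierarchical variance components in a general package, an R sketch with lme4 (the data frame fungus and its column names are assumptions):

library(lme4)
# hierarchical variance components: regions, populations within regions, residual = individuals
fit <- lmer(marker_value ~ 1 + (1 | region / population), data = fungus, REML = TRUE)
VarCorr(fit)          # REML variance components at each tier
summary(fit)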


1410 - 1510

MONDAY 30TH NOV Session 3 Boardroom: Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ-funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla1,2

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation such as spatial trends and extraneous environmental variation need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high dimensional genetic component becomes problematic. This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way, even when the number of genetic variables exceeds the number of observations. The method is then applied to wheat quality traits and a well-established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, together with associated phenotypic data (milk yield, protein, casein and total solids percentage and yield, and growth hormone, IGF and insulin levels). This data is highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data were analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data were used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.
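A minimal sketch of the per-gene mixed-effects step described above, using lme4 in R; cow_data is an assumed data frame holding one gene's expression, one phenotype and sire identity.

library(lme4)
# association between one gene's expression and a phenotype, with sire as a random effect
fit <- lmer(milk_yield ~ expression + (1 | sire), data = cow_data)
summary(fit)                  # fixed-effect slope for expression
confint(fit, parm = "beta_")  # profile confidence intervals for the fixed effects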


1540 - 1700

MONDAY 30TH NOV Session 4 Swifts: Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined: some are highly statistical, some are based much more on clinical judgement, some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum, we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment; nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin, to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication, and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample these texture analysis and rheological methods generated many data points, which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly.
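The curve-summarising step can be sketched in base R; this is only an illustrative fragment, with the vectors time and force standing in for one sample's measured curve.

# locate the main peak and the point of steepest rise on one measured curve
peak_idx <- which.max(force)                     # index of the maximum response
slope    <- diff(force) / diff(time)             # finite-difference slope along the curve
rise_idx <- which.max(slope)                     # index of maximum slope
summary_row <- data.frame(peak_time = time[peak_idx], peak_value = force[peak_idx],
                          rise_time = time[rise_idx], max_slope = slope[rise_idx])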


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful; hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern, as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurement of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to previous studies, in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves support the notion that growth of different populations differs. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length


1540 - 1700

MONDAY 30TH NOV Session 4 Boardroom: Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modeled in dependence on a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood-based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains, such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Australia, Mathematics Informatics and Statistics
2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, how best to handle the dynamic nature of the system, and taking into account various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second order model, central composite design
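For readers unfamiliar with the algorithm, a bare-bones backfitting loop for an additive model with two components can be written in a few lines of R. This generic sketch is not the authors' implementation, and the smoothing-spline component fits are an illustrative choice.

# simple backfitting for y = mu + f1(x1) + f2(x2) + error
backfit <- function(y, x1, x2, n_iter = 20) {
  mu <- mean(y)
  f1 <- rep(0, length(y)); f2 <- rep(0, length(y))
  for (i in seq_len(n_iter)) {
    f1 <- predict(smooth.spline(x1, y - mu - f2), x1)$y   # update f1 holding f2 fixed
    f1 <- f1 - mean(f1)                                   # centre for identifiability
    f2 <- predict(smooth.spline(x2, y - mu - f1), x2)$y   # update f2 holding f1 fixed
    f2 <- f2 - mean(f2)
  }
  list(intercept = mu, f1 = f1, f2 = f2)
}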


TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland1
1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators, such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.


950 - 1030

TUESDAY 1ST DEC Session 1 Swifts: Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were more recently developed in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above-mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision, it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state No Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.
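A direct-likelihood analysis of the observed longitudinal data (with no carrying forward) is often implemented as a mixed model for repeated measures; a hedged R sketch with nlme, using hypothetical variable names (pain_long in long format, visit_num an integer visit index):

library(nlme)
# likelihood-based repeated measures model using all observed visits, no imputation
fit <- gls(pain ~ treatment * factor(visit_num),
           correlation = corSymm(form = ~ visit_num | patient),   # unstructured correlation
           weights = varIdent(form = ~ 1 | visit_num),            # separate variance per visit
           na.action = na.omit, data = pain_long)
summary(fit)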


APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

950 - 1030

TUESDAY 1ST DEC Session 1 Boardroom: Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

We will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
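One common way to accommodate both features, shown here purely as an illustrative R sketch rather than the modelling route taken in the talk, is a zero-inflated negative binomial model from the pscl package (the data frame sites and its variables are assumptions):

library(pscl)
# Poisson baseline, then a model allowing both overdispersion and extra zeros
fit_pois <- glm(count ~ habitat + rainfall, family = poisson, data = sites)
fit_zinb <- zeroinfl(count ~ habitat + rainfall | habitat, dist = "negbin", data = sites)
AIC(fit_pois, fit_zinb)       # compare the simple Poisson fit with the zero-inflated NB fit
summary(fit_zinb)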


A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well-known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable, due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients to be non-negative, rather than the fitted means. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.
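For context, the standard (and sometimes unstable) way of fitting an additive Poisson model in R is shown below; this illustrates the problem the talk's EM approach addresses, not the proposed method itself, and the data frame cohort with numeric covariates is hypothetical.

# identity-link Poisson regression: mean count = b0 + b1*exposure + b2*age
fit <- glm(cases ~ exposure + age, family = poisson(link = "identity"),
           data = cohort,
           start = c(mean(cohort$cases), 0, 0))   # starting values keep the initial fitted means positive
summary(fit)   # coefficients are absolute differences in means; convergence can fail near the boundary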


1130 - 1230

TUESDAY 1ST DEC Session 2 Swifts: Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.


ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1 Andrea Rotnitzky23 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV-infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on the efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models, and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV positive patients to illustrate estimation of the optimal CD4 count level at which to start HAART.


PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID

STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and non-frailty models in an analysis involving recurrent myocardial infarction events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among non-frailty models, the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk compared with those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.
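Conditional (gap-time, stratified-by-event-number) models with and without frailty can be sketched in R with the survival package. This is a generic semi-parametric illustration rather than the parametric Weibull-gamma analysis of the LIPID data, and the data frame mi_long (one row per at-risk interval) with its variables is assumed.

library(survival)
# stratified conditional (PWP) gap-time model: strata by event number, robust SEs clustered on patient
fit_pwp <- coxph(Surv(gap_time, status) ~ treatment + age + strata(event_number) + cluster(id),
                 data = mi_long)
# gap-time model with a gamma frailty term for unobserved patient-level heterogeneity
fit_frail <- coxph(Surv(gap_time, status) ~ treatment + age + frailty(id, distribution = "gamma"),
                   data = mi_long)
summary(fit_pwp); summary(fit_frail)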


1130 - 1230

TUESDAY 1ST DEC Session 2 Boardroom: Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities, and is in that sense superior to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing similar parameter estimates after imputation as those found when data were fully observed. It will be shown that the amount of missingness present in the data set and the nature of the variable in question affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.
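The multiple-imputation-plus-logistic-regression step can be sketched with the mice package in R; the data frame and variable names below are placeholders, not those of the pregnancy study.

library(mice)
# impute missing covariates, fit the logistic model to each completed data set, then pool by Rubin's rules
imp  <- mice(preg_data, m = 20, seed = 1)                  # 20 imputed data sets
fits <- with(imp, glm(outcome ~ age + hormone_level + bmi, family = binomial))
pool(fits)                                                 # combined estimates and standard errors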


STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists use techniques that produce high dimensional data to unveil hidden information. Although several model-based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple, as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.


EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e. the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1 / p2 and r- = (1 - p1) / (1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio rx = f1(x) / f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating rx and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.


WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample for a "validation study" or a sample stratified on a health outcome for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.


950 - 1030

WEDNESDAY 2ND DEC Session 1 Swifts: Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al (2002, Biometrical J 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient, methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al 1997, Ann Internal Medicine 127: 596-603), the same setting that was used by Arbogast et al for their simulations.
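The basic survey-weighted estimating-equation analysis, the simplest member of the class discussed above, can be sketched with the survey package in R; the data frame of responders and the weight variable are assumptions.

library(survey)
# inverse-selection-probability weights for responders in a case-control sample
des <- svydesign(ids = ~ 1, weights = ~ inv_sel_prob_weight, data = cc_responders)
fit <- svyglm(case ~ exposure + age + smoking, design = des, family = quasibinomial())
summary(fit)   # weighted logistic regression with design-based standard errors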


CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute
2Texas A&M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.


9:50 - 10:30

WEDNESDAY 2ND DEC, Session 1, Boardroom: Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary tasks for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model reflecting the sampling process commonly used in marine surveys is proposed, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling has a strong influence on presence/absence measures of species, and that this influence cannot be ignored.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.


11:00 - 12:20

WEDNESDAY 2ND DEC, Session 2, Swifts: Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
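Two of the candidate estimators mentioned above can be sketched in R as follows (an illustration under assumed names, not the authors' simulation code): log binomial regression, and Zou's log Poisson regression with robust variance.

```r
# Hedged sketch of two relative-risk estimators for a binary outcome 'y',
# treatment 'trt' and a baseline covariate 'x' in a data frame 'trial'.
library(sandwich)
library(lmtest)

# Log binomial regression: exp(coef) is an adjusted relative risk,
# but the fit may fail to converge.
fit_lb <- glm(y ~ trt + x, family = binomial(link = "log"), data = trial)

# Modified Poisson (Zou 2004): same log link, robust (sandwich) variance.
fit_mp <- glm(y ~ trt + x, family = poisson(link = "log"), data = trial)
coeftest(fit_mp, vcov = vcovHC(fit_mp, type = "HC0"))
exp(coef(fit_mp)["trt"])   # estimated relative risk for treatment
```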


MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University; 2Academia Sinica

E-mail espark02gmailcom

The idea of a response adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial, without diminishing its statistical significance and efficiency too much. In addition, innovation in genomics-related biomedical research has made personalised medicine possible, which also makes adjustment for the covariates of subjects joining the trial an important issue in a clinical trial.

Adaptive design is a longstanding statistical method for situations where the design of a statistical model involves unknown parameters that must be estimated during the course of an experiment. Thus the concept of sequential analysis is naturally involved. The large sample properties of estimation under such a scheme have been studied and can be found in the literature, for example Zhang et al. (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires that both the estimation and design procedures be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure, the multiple-stage method, which requires the estimation and design to be updated only at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it retains the advantages of the fully sequential method to some extent while being more convenient in practical operation. Here we study the three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesised data is also presented.

Traditionally, a response-adaptive (RA) design is used under the assumption that there is no treatment-covariate interaction effect, i.e. the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the RA design will make incorrect treatment allocations; that is, the allocation can be correct in one part of the population but completely wrong in the other. Thus, in this case the CARA design should perform better than the RA design.


In this work we also compare sequential analysis in response adaptive designs with and without covariate adjustment, and a numerical study of synthesised data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail patrickgrahamotagoacnz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al. (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes, and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30 day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.
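One generic way to construct propensity scores for a multiple-category exposure is a multinomial logit on case-mix variables followed by stratification on the estimated scores. The R sketch below is an assumed illustration of that generic ingredient only, not the author's Huang-et-al-based implementation; all data and column names ('admissions', 'hospital', 'age', 'sex', 'comorb', 'death30', "HospitalA") are hypothetical.

```r
# Hedged sketch: multiple propensity scores for hospital, then quintile strata
# on the score for one hospital of interest, with crude stratum-specific risks.
library(nnet)

ps_fit <- multinom(hospital ~ age + sex + comorb, data = admissions)
ps_mat <- predict(ps_fit, type = "probs")     # one score column per hospital

ps_a    <- ps_mat[, "HospitalA"]
stratum <- cut(ps_a, quantile(ps_a, 0:5 / 5), include.lowest = TRUE)
tapply(admissions$death30, list(stratum, admissions$hospital), mean)
```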


LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University; 2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail iliumsorvuwacnz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables, controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, many surveys have a situation where respondents may select more than one outcome category, so the observations can fall in more than one cell of the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.
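For readers unfamiliar with the estimand, a local odds ratio is simply the odds ratio of an adjacent 2 x 2 sub-table. The toy R snippet below (invented counts, not the authors' data) makes the definition concrete.

```r
# Local odds ratios for an I x J table:
#   theta[i, j] = n[i, j] * n[i+1, j+1] / (n[i, j+1] * n[i+1, j])
local_or <- function(n) {
  I <- nrow(n); J <- ncol(n)
  outer(1:(I - 1), 1:(J - 1),
        Vectorize(function(i, j)
          n[i, j] * n[i + 1, j + 1] / (n[i, j + 1] * n[i + 1, j])))
}

tab <- matrix(c(20, 15, 5,
                10, 25, 15), nrow = 2, byrow = TRUE)
local_or(tab)   # a 2 x 3 table gives a 1 x 2 matrix of local odds ratios
```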


11:00 - 12:20

WEDNESDAY 2ND DEC, Session 2, Boardroom: Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail rogerlittlejohnagresearchconz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail mdantuonoagricwagovau

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the seeming lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS: ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1 Murray Judd2 John Meekings3 Annette Richardson3 and Eric Walton4

1Swinburne University of Technology; 2Seeka Kiwifruit Industries

3The New Zealand Institute for Plant and Food Research Ltd; 4University of Otago

E-mail dmeyerswineduau

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail Paulinedinganueduau

There is growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were Ground Cover Type (covered, uncovered), Distance from the log (0cm, 10cm, 20cm, 40cm, 80cm) and Depth (1cm, 5cm). Two non-linear mixed models were used to study the different treatment effects.

13:30 WEDNESDAY 2ND DEC. Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1
1Wagga Wagga Agricultural Institute, Australia

2Rothamsted Research, Harpenden, UK

E-mail alisonsmithindustrynswgovau

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples. This precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.


14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3, Swifts: Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication of cardiac myopathy (weakening of the heart muscle) is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent, second-phase, laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand; 2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
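The LIMMA step mentioned above typically reduces to a few calls. The sketch below is an assumed outline only (not the speakers' analysis); 'MA' (normalised two-colour log-ratios) and 'design' are hypothetical objects encoding the acidity and heat-shock contrasts.

```r
# Hedged sketch of a limma analysis for the Phase 2 microarray data.
library(limma)

fit <- lmFit(MA, design)          # gene-wise linear models
fit <- eBayes(fit)                # empirical Bayes moderated statistics
topTable(fit, coef = "acidity")   # genes responding to seawater acidity
```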

14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3, Boardroom: Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1 Gillian Whalley1 Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately from relaxation.

Three-dimensional ultrasound images of the heart allow the volume of the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against the first and second derivatives traces out a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation, and so develop a new measure of global cardiac function.
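A minimal functional-data sketch of the volume-derivative construction (assumed, not the authors' code) could use the R fda package; 'times' and 'volume' are hypothetical vectors from one imaging sequence.

```r
# Hedged sketch: smooth LV volume over the cardiac cycle and take derivatives.
library(fda)

basis <- create.bspline.basis(rangeval = range(times), nbasis = 15)
volfd <- smooth.basis(times, volume, basis)$fd     # functional data object

v   <- eval.fd(times, volfd)                # fitted volume
dv  <- eval.fd(times, volfd, Lfdobj = 1)    # first derivative
d2v <- eval.fd(times, volfd, Lfdobj = 2)    # second derivative
# (v, dv, d2v) traces the closed three-dimensional loop described above
```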


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology divides the substance into, and quantifies the amount of, each compound that makes up the test substance. Typically the first step in an analysis of data like this is the alignment of the data, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances, e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here, this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al. (2009, Anal Chem 81(3), 1000-1007)
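The core idea, adding a penalty to the alignment cost whenever a non-diagonal step is taken, can be illustrated with a toy dynamic programme (a sketch of the idea only, not the authors' implementation).

```r
# Toy DTW cost with a penalty on non-diagonal (warping) steps.
penalised_dtw_cost <- function(x, y, penalty = 0) {
  n <- length(x); m <- length(y)
  D <- matrix(Inf, n + 1, m + 1); D[1, 1] <- 0
  for (i in 1:n) for (j in 1:m) {
    d <- abs(x[i] - y[j])
    D[i + 1, j + 1] <- min(D[i, j]     + d,            # diagonal step
                           D[i, j + 1] + d + penalty,  # non-diagonal step
                           D[i + 1, j] + d + penalty)  # non-diagonal step
  }
  D[n + 1, m + 1]   # total alignment cost
}

x <- sin(seq(0, 6, length.out = 50))
y <- sin(seq(0, 6, length.out = 50) - 0.3)   # locally shifted copy
penalised_dtw_cost(x, y, penalty = 0.1)
```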


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1 Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium; 2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, for example, two-channel cDNA microarrays: peptides from two biological samples are analysed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, allows the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may acquire various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al. (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model, by proposing a Bayesian formulation of the model. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4, Swifts: Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1
1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians grows from a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although it is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances where this is not the case, which will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates, which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the FWER or FDR should enable us to improve the power of the procedures used to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.
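As a toy illustration of the setting (not the speakers' simulation), one can generate test statistics from a linear process, so that the null statistics are dependent, and apply the usual Benjamini-Hochberg adjustment.

```r
# Dependent null test statistics from an AR(1) linear process, then BH.
set.seed(1)
m <- 5000
z <- as.numeric(arima.sim(model = list(ar = 0.6), n = m))
z <- z * sqrt(1 - 0.6^2)                 # rescale to unit marginal variance
p <- 2 * pnorm(-abs(z))                  # two-sided p-values; all nulls true
sum(p.adjust(p, method = "BH") < 0.05)   # any rejection is a false discovery
```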


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand; 2University of South Carolina, USA

3University of Montreal, Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution, we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities. Using various different examples, we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1
1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) are used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1
1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4, Boardroom: Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1
1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show that the new approach performs better through simulation studies and some real-world biological data sets.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally-sized, or balanced, and the classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class/group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). The traditional classification techniques perform poorly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being on building models to correctly classify the minority class.

In this presentation, a brief overview of the approaches found in the literature is given, followed by details of some alternatives proposed. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples, and the findings are discussed.


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of the error rates is of vital importance in classification problems, as this is used as a basis to choose the best discriminant function, i.e. the one with a minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally, a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data, in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and QDF are derived and computed for various covariance structures in a simulation exercise, and these serve as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. This approximation also provides a closed form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data


THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS - A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1
1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory bowel diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9:50 - 10:30

THURSDAY 3RD DEC, Session 1, Swifts: Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao1,2, Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail klecaouqeduau

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis, using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap. 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1
1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift can be used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation, and use simulations to compare its performance with that of the existing linkage disequilibrium estimator of Ne.

9:50 - 10:30

THURSDAY 3RD DEC, Session 1, Boardroom: Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS: A STATISTICAL ENCOUNTER

Dave Saville1
1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park, with nearby housing developments experiencing the noise impact of trumpeting.


11:00 THURSDAY 3RD DEC. Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1, IH Delacy1,2, J Crossa3, PM Kroonenberg4, MJ Dieters1 and KE Basford1,2

1The University of Queensland, Australia; 2Australian Centre for Plant Functional Genomics, Australia

3CIMMYT, Mexico; 4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al. 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address the non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays, based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analysing the same genotypes for different TAM blocks and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


11:40 - 12:20

THURSDAY 3RD DEC, Session 2, Swifts: Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1
1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC, respectively). In some medical diagnostics the false positive rate must be confined within a specific range, which makes the pAUC a reasonable choice in such circumstances. Thus we emphasise pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

$l_p = (w_D \Sigma_D + w_{\bar{D}} \Sigma_{\bar{D}})^{-1} (\mu_D - \mu_{\bar{D}})$,

where $\mu_D, \Sigma_D$ and $\mu_{\bar{D}}, \Sigma_{\bar{D}}$ are the mean vectors and covariance matrices of the disease and non-disease groups, respectively, and the coefficients $w_D, w_{\bar{D}} \in \mathbb{R}^1$ depend on the given specificity and are also functions of $l_p$. Thus the solution for $l_p$ requires an iterative procedure. We apply it to the data set of Liu et al. (2005, Stat in Med) and the numerical results show that our method outperforms that of Liu et al. (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al. (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes the pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large sample properties of this method are derived. We then apply it to some real data sets and the results are very promising, locating markers that are not found via AUC-based methods.


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1
1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore, an asymmetric decision rule as proposed by Bauer & Köhne (1994) for adaptive designs is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus, the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994); for example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
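The decision rule quoted above is easy to state as code. The sketch below (assumed, not the author's software) takes the two phase p-values and user-supplied constants; c_alpha is left to the user because its value must be calibrated to the intended overall level.

```r
# Hedged sketch of the modified combination test decision rule.
modified_combination_test <- function(p1, p2, alpha0, alpha1, c_alpha) {
  max(p1, p2) <= alpha1 ||
    (max(p1, p2) <= alpha0 && p1 * p2 <= c_alpha)
}

# Example with the constants quoted in the abstract (alpha = 0.05, alpha0 = 0.5,
# alpha1 = 0.1793); the c_alpha value here is illustrative only.
modified_combination_test(0.04, 0.12, alpha0 = 0.5, alpha1 = 0.1793,
                          c_alpha = 0.0087)
```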


11:40 - 12:20

THURSDAY 3RD DEC, Session 2, Boardroom: Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2

1CSIRO Mathematics, Informatics and Statistics; 2CSIRO Food Futures National Research Flagship

3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represents phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, due to multiple founders and intermediate generations being unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC population in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1 Benoit Auvray1 Peter Amer2 Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch Invermay Agricultural Centre, Mosgiel, New Zealand; 2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will be almost true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets, and appropriate phenotypes, for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler1
1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are often assumed for, but are not valid for, a classical inference. For example, p-values from a classical analysis are often interpreted as giving 1 minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p-value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews). In this poster, Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1 Matthews (2001, J Stat Plan Inf 94, 43-58)


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1 Tony Swain2 Olena Kravchuk1 and Geoffry Fordyce2

1School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia

2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<57) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms, and unequal variances for the repeated measures, with the heifer-by-day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al., Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The interaction vaccine × weaner size × time was only significant in 1992. The interaction vaccine × time was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccinations, but the vaccine effect diminished as heifers aged. The interaction nutrition × weaner size × time was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study, with its complicated unbalanced repeated measures design.


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1
1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable: omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the tree, which is then used to predict the other half of the data. By examining the predictive ability of the several thousand trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
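A minimal sketch of a boosted regression tree fit of this general kind with the R package gbm mentioned above; the data frame cicada, its response abundance and the tuning values are assumed, not taken from the study:

library(gbm)
set.seed(1)
fit <- gbm(abundance ~ ., data = cicada,
           distribution = "poisson",      # count response; "gaussian" is another option
           n.trees = 5000, interaction.depth = 3, shrinkage = 0.01,
           train.fraction = 0.5)          # half the data used for fitting, half held out
best <- gbm.perf(fit, method = "test")    # number of trees chosen on held-out predictive ability
summary(fit, n.trees = best)              # relative influence ranking of the variables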

92 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep and in the soil and grass surrounding the faeces. The data come from an 18 month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land Crop and Food Sciences, University of Queensland
2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling in an experiment which was investigating the digestibility of the sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions clearly changed, in a complex way, with changes in the milling energy, and the average volumetric diameter alone was not an adequate summary for the distributions. It was thus necessary to construct a tailored algorithm for summarizing the distributional changes and relating them to the digestion rates of flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.
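As an illustrative sketch (not the authors' algorithm), a three-component log-normal mixture could be fitted to binned particle-size data by least squares in R as follows; breaks (bin edges) and volpct (volumetric percentages, one per bin) are assumed inputs:

mix.cdf <- function(x, w, mu, sig) {
  w <- w / sum(w)                                   # normalise the mixing weights
  rowSums(sapply(1:3, function(k) w[k] * plnorm(x, mu[k], sig[k])))
}
ss <- function(par) {
  w <- exp(par[1:3]); mu <- par[4:6]; sig <- exp(par[7:9])
  p <- diff(mix.cdf(breaks, w, mu, sig))            # model probability in each bin
  sum((volpct / 100 - p)^2)                         # squared distance to observed proportions
}
start <- c(log(c(1, 1, 1)), log(c(5, 50, 300)), log(c(0.5, 0.5, 0.5)))  # assumed starting values
fit <- optim(start, ss, method = "BFGS")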

93The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1, Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne
2Remote sensing team, CSIRO Sustainable Ecosystems
3Senior Lecturer, The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
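A minimal sketch of ridge and LASSO fits with penalty selection by cross-validation (shown here in place of the GCV and 0.632+ criteria discussed above), using the R package glmnet; the predictor matrix X, response y and new data X.new are assumed:

library(glmnet)
ridge <- cv.glmnet(X, y, alpha = 0)      # ridge regression
lasso <- cv.glmnet(X, y, alpha = 1)      # the LASSO
coef(lasso, s = "lambda.min")            # coefficients at the cross-validated penalty
pred  <- predict(ridge, newx = X.new, s = "lambda.min")   # predictions for new locations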

94 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1, David Lovell1, Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics Informatics and Statistics, Canberra, Australia
2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens, if not hundreds, of thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and explore the extent to which this might be a problem in applications. In particular, we compare the analysis of log-transformed data to full compositional data analysis.
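A minimal sketch of one standard compositional-data device, the centred log-ratio (clr) transformation, which can be applied before ordinary multivariate methods; the samples-by-components count matrix counts is assumed (zero counts would need a pseudo-count first):

clr <- function(x) {
  lx <- log(x)
  sweep(lx, 1, rowMeans(lx))    # subtract each sample's mean log-component
}
props   <- prop.table(as.matrix(counts), margin = 1)   # close each sample to proportions
clrData <- clr(props)                                   # analyse clrData rather than the raw counts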

95The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good model for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual validation tool, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift in the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative, we have employed a minimum-squares-type estimate of the parameters based on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data for all but one of the 83 species.
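A minimal sketch of the kind of minimum-squares fit on a Probability-Probability plot described above, compared with the ML fit; the weight vector w is assumed:

w <- sort(w); n <- length(w); u <- ppoints(n)          # uniform plotting positions
ssq <- function(par) {
  shape <- exp(par[1]); rate <- exp(par[2])
  sum((pgamma(w, shape, rate) - u)^2)                  # squared distances on the P-P plot
}
mle <- MASS::fitdistr(w, "gamma")                      # maximum likelihood, for comparison
fit <- optim(log(mle$estimate), ssq)
plot(u, pgamma(w, exp(fit$par[1]), exp(fit$par[2])),   # P-P plot for the minimum-squares fit
     xlab = "Uniform quantiles", ylab = "Fitted gamma probabilities"); abline(0, 1)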

96 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1

1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Complications of diabetes, such as kidney disease, cause patients considerable pain and cost, and the creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data on this trend in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that used the files of patients with type 2 diabetes who attended the Isfahan Endocrine and Metabolism Research Center from 1997 to 2007. The information was collected longitudinally, and we used linear mixed-effects models to analyse the data. Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information provided by this study can be used to identify groups at high risk of renal dysfunction.

Key Words: Longitudinal study; Mixed effect models; Creatinine; Type 2 diabetes
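As an illustrative sketch only (not the authors' model), a longitudinal linear mixed-effects model of this general kind could be written in R with lme4; the data frame iemrc and all variable names are assumed:

library(lme4)
fit <- lmer(creatinine ~ time + sex + age + duration + BUN + FBS + sbp +
              (1 + time | patient),      # random intercept and slope for each patient
            data = iemrc, REML = TRUE)
summary(fit)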

97The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS WHO UNDERWENT ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1, Baki Billah2 and Christopher Reid2

1PhD student in biostatistics, Department of
2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), following isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Models of postoperative complications for isolated CABG surgery developed in one population may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgery for an Australian population because no model has been developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14,533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation (60%) set and a model validation (40%) set. The creation set was used to develop the model and the validation set to validate it. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC curve and the Hosmer-Lemeshow p-value respectively. Results: Among the 14,533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3) years. The rates of the two postoperative complications were 3.65% for new renal failure and 1.38% for stroke. The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L p < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L p < 0.001). Conclusion: We have identified risk factors for two major postoperative complications of CABG surgery.
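A minimal sketch of the model-building steps described, fitting a logistic regression on a creation set and checking discrimination on a validation set; all data frame and variable names are assumed:

fit <- glm(renal_failure ~ age + gender + iabp + vascular_disease + dialysis +
             ef_estimate + cpb_time + shock + vad,
           family = binomial, data = creation)                  # 60% creation set
pred <- predict(fit, newdata = validation, type = "response")   # 40% validation set
library(pROC)
auc(roc(validation$renal_failure, pred))   # ROC area (discrimination); calibration could be
                                           # checked with, e.g., a Hosmer-Lemeshow test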

98 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2

1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics Informatics and Statistics
2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis in which physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data; instead, they are usually point predictions from spatial models fitted to auxiliary data sources. It is not clear what effect using modelled covariates has on the fitted model, although simple approximations for simple models do give some indication. We have performed some simulation studies to investigate the manner of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.

99The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno
2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this work is to identify the dependency structure of gene variants which influence septic states in paediatric patients.

The data set contains data on 580 paediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1,2,3]. To identify the role of different combinations of gene variants, and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. This made it possible to create a 5-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patients group. The combinations of gene variants typical for the healthy group and for the septic patients group were then found. The results correspond nicely to those published in [1,2,3] for individual genes, and make it possible to recognise the typical combinations of variants of the six genes on which attention should be focused.

References:
[1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal/permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol 33, pp 2158-2164.
[2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. Cesko-Slovenska Pediatrie 59, pp 632-636, 2004.
[3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukine-6 gene variants and the risk of sepsis development in children. Human Immunology, ELSEVIER SCIENCE INC, ISSN 0198-8859, 2007, vol 68, pp 756-760.
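A minimal sketch of a hierarchical log-linear comparison of association structures for a multi-way table of gene-variant counts, using MASS::loglm in R; the 5-way contingency table genetab and its dimension names are assumed:

library(MASS)
sat    <- loglm(~ TLR299 * BPITaq * LBP429 * IL6 * HSP70, data = genetab)     # saturated model
twoway <- loglm(~ (TLR299 + BPITaq + LBP429 + IL6 + HSP70)^2, data = genetab) # all two-way associations
anova(twoway, sat)   # does the simpler association structure suffice for this group?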

100 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1, PJ Cameron2, PJ Wigley3, S Elliott3, S Madhusudan, JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd
220 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique, involving field application of Bacillus thuringiensis Berliner (Bt) with a tractor-mounted boom sprayer, was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established the persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased 15-18 fold, to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion of the moths remained was also estimated; viz. for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
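The displayed equation can be solved numerically for c once b and p are given; a minimal sketch in R, with the values of b and p below invented for illustration:

b <- 0.01                                     # assumed slope estimate from the fitted dispersal curve
p <- 0.90                                     # proportion of moths to be contained
f <- function(cc) exp(-b * cc) * (1 + b * cc) - (1 - p)
uniroot(f, c(0, 10000))$root                  # distance c (metres) containing proportion p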

101The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1, Dominic Lee1, Glynn Russell2, Brian Darlow3, Brian Scarrott and Marco Reale1

1University of Canterbury
2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study all oxygen saturation measurements across different behavioural states were combined. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than that based on the other states. Furthermore, clustering based on the standard deviation is superior to that based on the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M, Lee DS, Russell G, et al (2008). Australian Statistical Conference 2008.
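As an illustrative sketch only (not the authors' procedure), babies could be clustered on the empirical distribution functions of a saturation measure in R as follows, using a maximum-separation (Kolmogorov-type) distance between ECDFs; the data frame spo2 and its columns are assumed:

grid  <- seq(80, 100, by = 0.1)
ecdfs <- sapply(split(spo2$cv, spo2$baby), function(x) ecdf(x)(grid))  # one column per baby
d  <- dist(t(ecdfs), method = "maximum")   # largest vertical gap between two ECDFs
hc <- hclust(d, method = "average")
cutree(hc, k = 2)                          # two groups: putatively stable vs unstable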

102 The International Biometric Society Australasian Region Conference

Index of Presenting Authors

Arnold R 33
Asghari M 26
Baird D 30
Balinas V 46
Barnes M 27
Basford KE 84
Beath K 51
Bland JM 49
Briggs J 76
Burridge C 28
Burzykowski T 73
Butler R (poster) 89
Campain A 56
Chang K 70
Chang Y 85
Chee C 77
Clark ASS 36
Clarke S 74
Clifford D 72
Connolly P (poster) 91
Cui J 55
D'Antuono M 67
Darnell R (1) 35
Darnell R (2) 47
Davy M 40
Day S 43
Ding P 69
Dobbie M 48
Dodds K 88
Fewster R 37
Forrester R 34
Ganesalingam S 79
Ganesh S 78
Gatpatan JMC 48
Graham M 33
Graham P 65
Huang E 87
Hwang J 57
Ihaka R 36
Jones G 45
Kifley A 53
Kipnis V 61
Koolaard J (poster) 92
Kravchuk O (poster 1) 90
Kravchuk O (poster 2) 92
Lazaridis D 93

Le Cao K 81
Littlejohn R 67
Liu I 66
Liu J 82
Lumley T 59
Marschner I 52
Matthews D 58
McLachlan A 44
Meyer D 68
Meyer R 75
Mohebbi M 38
Mueller S 47
Muller W (poster) 94
Naka M (poster) 95
Neeman T 82
Neuhäuser M 86
Orellana L 54
Park E 64
Park Z 42
Pirie M 32
Poppe K 71
Rousta S (poster) 96
Ruggiero K 71
Ryan L 25
Sanagou M (poster) 97
Saville D 83
Scott A 60
Shimadzu H 62
Shimadzu H (poster) 98
Sibanda N 76
Smerek M (poster) 99
Smith AB 69
Stewart M 77
Stojanovski E 31
Taylor J 41
Thijs H 50
Triggs CM 80
Wallace AR (poster) 100
Wang Y 29
Welsh A 51
Williams E 70
Yee T 62
Yelland L 63
Yoon H 39
Zahari M (poster) 101

103The International Biometric Society Australasian Region Conference

NOTES

104 The International Biometric Society Australasian Region Conference

NOTES

105The International Biometric Society Australasian Region Conference

NOTES

106 The International Biometric Society Australasian Region Conference

NOTES

107The International Biometric Society Australasian Region Conference

DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau

108 The International Biometric Society Australasian Region Conference

Name E-mail

Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde

Delegates List

109The International Biometric Society Australasian Region Conference

Delegates List

Name Email

Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz



CONFERENCE AT A GLANCE

Welcome Reception - Sunday 29 Nov

A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.

Keynote addresses

Louise Ryan (Monday 30 Nov) Martin Bland (Tuesday 1 Dec) Thomas Lumley (Wednesday 2 Dec) Chris Triggs (Thursday 3 Dec)

All keynote addresses begin at 9am and will be held in Swifts (see map of venue on page 16)

Invited Speakers

Ross Ihaka (1330 Monday 30 Nov) Alison Smith (1330 Wednesday 2 Dec)

Kaye Basford (1100 Thursday 3 Dec)

All invited speaker talks will be held in Swifts (see map of venue on page 16)

Organised Social Activities - Tuesday 1 Dec

This is a long-standing part of the conference program, so, keeping with tradition, we have arranged four options for the afternoon of Tuesday 1 December, after lunch. We hope that you will find at least one of these activities attractive, because we want you to relax, get a breath of fresh air (or sulphur fumes at Orakei Korako), have fun and see some of what this part of New Zealand has to offer, especially for first-time visitors. These activities are optional and tickets need to be purchased for them through the conference organisers. Preferences will be considered on a first come, first served basis. If you have queries about any of the social activities please contact Hans Hockey in the first instance. An afternoon snack will be provided on the cruise and kayaking, while the bus tour visits a cafe and the jet boating trip is too action-packed for eating.

If you have not already registered for one of these activities, please talk to someone at the registration desk to make arrangements before Tuesday morning. Please note that all costs for the activities are in New Zealand Dollars.

3The International Biometric Society Australasian Region Conference

Conference At A Glance

Conference dinner - Wednesday 2 Dec

To add some novelty to the conference program, the conference dinner will be held at the Prawn Park restaurant, home of Shawn the Prawn. At the Prawn Park, just 10 minutes' drive north of Taupo on the Waikato River (see map on page 8), you will be able to compete at prawn fishing or the Killer Prawn Hole-in-One Golf Challenge, take a guided tour of the nursery and hatchery, and enjoy fun and interactive water features and a glass of bubbly in the geothermal footbath, as well as a sumptuous meal with breathtaking views at the riverside restaurant (on the patio, weather permitting). Drinks (wine and non-alcoholic) will be provided and all dietary requirements can be catered for.

Coaches have been arranged to transfer delegates to Huka Prawn Farm from the Suncourt Hotel leaving 6 pm with return trips at the conclusion of the event

Conference Prizes - Donated by CSIRO Mathematical and Information Sciences

Prizes will be awarded for the best oral presentation and the best poster presentation by a young statistician, as judged by a panel. To be eligible for these awards the presenter must be a member of the IBS Australasian Region and be either a student (full-time or part-time), or a person who has graduated with a Bachelor's Degree (in a biometrical-related field) within the last five years, or a person awarded a Postgraduate Degree within the past year.

4 The International Biometric Society Australasian Region Conference

KEYNOTE SPEAKERS

Martin Bland, University of York

Martin Bland joined the University of York as Professor of Health Statistics in 2003. Before this he spent 27 years at St George's Hospital Medical School, University of London, following posts at St Thomas's Hospital Medical School and in industry with ICI. He has a BSc in mathematics, an MSc in Statistics and a PhD in epidemiology. He is the author or co-author of An Introduction to Medical Statistics (now in its third edition) and Statistical Questions in Evidence-based Medicine (both Oxford University Press), more than 190 refereed journal articles reporting public health and clinical research and on research methods, and, with Prof Doug Altman, the Statistics Notes series in the British Medical Journal. He is currently working on clinical trials in wound care, hazardous alcohol use, depression, irritable bowel syndrome and stroke prevention. His personal research interests are in the design and analysis of studies of clinical measurement and of cluster randomised clinical trials. His 1986 Lancet paper with Doug Altman on statistical methods for assessing agreement between two methods of clinical measurement has now been cited more than 13,000 times; it is the most cited paper ever to appear in the Lancet and has been reported to be the sixth most highly cited statistical paper ever.

Martin presented a two-day satellite course in Auckland on 25-26 November on Cluster Randomised Trials

Thomas Lumley University of Washington

Thomas Lumley is an Associate Professor in the Biostatistics Department at the University of Washington in Seattle. Thomas has accrued an impressive body of work and awards in a comparatively short amount of time. Since completing his PhD in 1998, Thomas has published well over 100 peer reviewed articles in the leading journals of statistics, biostatistics and the health sciences, on theory, methodology and application. In addition he has given a substantial number of talks and workshops around the world. In 2008 Thomas was awarded the Gertrude Cox Award for contributions to Statistical Practice. Thomas is also a member of the R Core development team and his expertise in the field of statistical computing is recognised worldwide.

5The International Biometric Society Australasian Region Conference

Keynote Speakers

Louise Ryan CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health, Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics, Informatics and Statistics (CMIS). Dr Ryan has a distinguished career in biostatistics, having authored or co-authored over 200 papers in peer-reviewed journals. Louise is a fellow of the American Statistical Association and the International Statistics Institute and is an elected member of the Institute of Medicine. She has served in a variety of professional capacities, including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society. She has served on advisory boards for several government agencies in the USA, including the National Toxicology Program and the Environmental Protection Agency, as well as several committees for the National Academy of Science. She retains an adjunct professorship at Harvard.

Chris Triggs University of Auckland

Chris Triggs is a Professor as well as being the current department head of Statistics at the University of Auckland, New Zealand. He has been a respected statistician for 30 years, specializing in fields as diverse as experimental design and forensic science. Professor Triggs has published more than 90 papers in a wide variety of statistical fields. His research interests include experimental design, population genetics and the application of statistical methods in many fields of science, including forensic science and nutrigenomics. He has lectured extensively in many of these subjects in Australasia. Professor Triggs is an Associate Editor for Biometrics and is often called upon as referee for many other journals.

6 The International Biometric Society Australasian Region Conference

INVITED SPEAKERS

Ross Ihaka, University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland. He is recognized as one of the originators of the R programming language. In 2008 he received the Royal Society of New Zealand's Pickering Medal for his work on R.

Kaye Basford University of Queensland

Kaye Basford is Head of the School of Land Crop and Food Sciences at the University of Queensland. Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments, in particular using a pattern analysis approach. Kaye is currently IBS Vice-President in advance of her Presidential term 2010-11.

Alison Smith NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit, where she works on and researches methodology for plant breeding multi-environment variety trials, plant quality trait experiments, micro-array data and outlier detection in linear mixed models.

7The International Biometric Society Australasian Region Conference

GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events.

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions.

Conference Catering
Lunches, Morning and Afternoon Teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16).

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk.

Transport has been arranged in coaches to transfer delegates to dinner from the Suncourt Hotel & Conference Centre, leaving 6 pm, with return trips at the conclusion of the event.

Welcome reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.

8 The International Biometric Society Australasian Region Conference

VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning Great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel
The Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (0.2 km)
Turn left at Titiraupenga Street (31 m)
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street then onto Tongariro Street)
Continue onto Tongariro Street (1.1 km - go through one roundabout)
Continue onto SH 1 / SH 5 (1.0 km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past Honey Hive to the end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo. (07) 378 8265. www.suncourt.co.nz
B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo

9The International Biometric Society Australasian Region Conference

ORGANISED SOCIAL ACTIVITIES Conferences can be intense and lead to ldquobrain strainrdquo for some so relief from the scientific program is often welcome and necessary for recharging ones batteries With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member, non-member or student) attending the whole week. Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events.

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere, so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim, then a meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo, Tel (07) 378-7080).

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities- Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors, smell the coffee brewing as you board the Waikare II, take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings. A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina. Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park. The sights are amazing all year round. Afternoon tea is included as part of your charter and tea or coffee are complimentary throughout the cruise. There are also full bar facilities.

Fishing for (and hopefully eating) rainbow or brown trout is included in the charter, although to meet licence requirements only four clients can be nominated to actually land the catch. Only 4 lines can be put out at a time on downriggers. If successful, any catch can be barbequed or sashimied and served and shared onboard - there

10 The International Biometric Society Australasian Region Conference

Organised Social Activities

is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation, where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting. The cost is $180 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front
Take: Swimwear, including towel, if you want an invigorating deep water swim off the launch. Don't forget to take your camera as some of the scenery can only be seen from on the water
Cost: $70 per person based on a three hour scenic charter including fishing, with clay bird shooting extra at $180 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.

11The International Biometric Society Australasian Region Conference

Organised Social Activities

Leaving the gushing sounds of the mesmerising Falls, you cut through a leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating, geothermal and nature - Orakei Korako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Korako, possibly the best thermal area in New Zealand.

12 The International Biometric Society Australasian Region Conference

Organised Social Activities

In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's cave and some of the largest silica terraces in the world.

While the park visitors are on land, option 2 ($140) whisks swimmers away to the Squeeze. You will disembark the boat in knee deep warm water. After manoeuvring your way through narrow crevasses, climbing boulders and wading through waist-deep warm water, you emerge in stunning native New Zealand bush. Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool.

Then the groups rejoin for the thrilling return trip, giving a total trip time of about three hours. This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience.

Time: Transport departs Suncourt Hotel at 1.30 pm, returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1 including park admission; $140 pp for option 2; both options including transport
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used

13The International Biometric Society Australasian Region Conference

Organised Social Activities

4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly; terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: Cafe snack is not included but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22

14 The International Biometric Society Australasian Region Conference

SPONSORS

The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland

15The International Biometric Society Australasian Region Conference

Sponsors

AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax

16 The International Biometric Society Australasian Region Conference

VENUE FLOOR PLAN

1 Boardroom: For all Board session presentations
2 Swifts: For keynote addresses, invited speaker talks and all Swifts sessions
3 Bathrooms/Toilets
4 'Chill on Northcroft' Restaurant: All morning/afternoon teas and lunches will be provided here
5 Gullivers: Computer room with two internet access desktops
6 Lems: Registration desk location and further desk space and power points for wireless internet access

[Floor plan: locations 1-6 marked on the venue map]

17The International Biometric Society Australasian Region Conference

CONFERENCE TIMETABLE

SUNDAY 29TH NOV
1600 Conference Registration opens
1800 Welcome Reception

Dinner (own arrangement)

MONDAY 30TH NOV
0850 Presidential Opening (Swifts)

Graham Hepworth, University of Melbourne
0900 Keynote Address (Swifts)

Louise Ryan, CSIRO Mathematics Informatics and Statistics
Quantifying uncertainty in risk assessment
Chair: Graham Hepworth

950

-1030

Session 1 Swifts Medical

Chair John Field

Session 1 Boardroom Ecological ModellingChair Teresa Neeman

950 Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approachMohamad Asghari Tarbiat Modares University

Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splinesCharis Burridge CSIRO Mathematics Informatics and Statistics

1010 Personalised medicine endovascular aneurysm repair risk assessment model using preoperative variablesMary Barnes CSIRO Mathematics Informatics and Statistics

Rank regression for analyzing environmental dataYou-Gan Wang CSIRO Mathematics Informatics and Statistics

1030 Morning Tea (30 minutes)1100

-1220

Session 2 Swifts Modelling

Chair Andrew McLachlan

Session 2 Boardroom Environmental amp Methods

Chair Zaneta Park1100 Introduction to Quantile

regressionDavid Baird VSN NZ Ltd

Capture recapture estimation using finite mixtures of arbitrary dimension Richard Arnold Victoria University

18 The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV1120 Incorporating study

characteristics in the modelling of associations across studiesElizabeth Stojanovski University of Newcastle

The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabiesRobert Forrester ANU

1140 A comparison of matrices of time series with application in dendroclimatologyMaryanne Pirie University of Auckland

Model based grouping of species across environmental gradientsRoss Darnell CSIRO Mathematics Informatics and Statistics

1200 How SAS and R integrateMichael Graham SAS Auckland

The use of the chi-square test when observations are dependentAustina Clark University of Otago

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts)
Ross Ihaka, University of Auckland
Writing Efficient Programs in R and Beyond
Chair: Renate Meyer

1410

-1510

Session 3 Swifts Variance

Chair Geoff Jones

Session 3 Boardroom Genetics

Chair John Koolaard

1410 Variance estimation for systematic designs in spatial surveysRachel Fewster University of Auckland

Developing modules in genepattern for gene expression analysisMarcus Davy Plant and Food Research

1430 Variance components analysis for balanced and unbalanced data in reliability of gait measurementMohammadreza Mohebbi Monash University

High dimensional QTL analysis within complex linear mixed modelsJulian Taylor CSIRO Mathematics Informatics and Statistics

1450 Modernizing AMOVA using ANOVAHwan-Jin Yoon ANU

Correlation of transcriptomic and phenotypic data in dairy cowsZaneta Park AgResearch

1510 Afternoon Tea (30 minutes)

19The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV
1540-1700

Session 4 Swifts Modelling

Chair Mario DrsquoAntuono

Session 4 Boardroom Ecology

Chair Rachel Fewster1540 Non-inferiority margins in

clinical trialsSimon Day Roche Products Ltd

Visualising model selection criteria for presence and absence data in ecology Samuel Mueller University of Sydney

1600 Data processing using Excel with RAndrew McLachlan Plant and Food Research Lincoln

Estimating weights for constructing composite environmental indicesRoss Darnell CSIRO Mathematics Informatics and Statistics

1620 Investigating covariate effects on BDD infection with longitudinal data Geoffrey Jones Massey University

A spatial design for monitoring the health of a large-scale freshwater river systemMelissa Dobbie CSIRO Mathematics Informatics and Statistics

1640 Statistical modelling of intrauterine growth for FilipinosVincente Balinas University of the Philippines Visayas

Backfitting estimation of a response surface modelJhoanne Marsh C Gatpatan University of the Philippines Visayas

1700 Poster SessionChair Melissa Dobbie

1800 Dinner (own arrangement)

20 The International Biometric Society Australasian Region Conference

Conference Timetable

TUESDAY 1ST DEC
0900 Keynote Address (Swifts)

Martin Bland, University of York
Clustering by treatment provider in randomised trials
Chair: Simon Day

950

-1030

Session 1 Swifts Missing Data

Chair Vanessa Cave

Session 1 Boardroom Count Data

Chair Hwan-Jin Yoon950 The future of missing data

Herbet Thijs Hasselt University

A strategy for modelling count data which may have extra zerosAlan Welsh ANU

1010 Application of latent class with random effects models to longitudinal dataKen Beath Macquarie University

A reliable constrained method for identity link Poisson regressionIan Marschner Macquarie University

1030 Morning Tea / IBS Biennial General Meeting (60 minutes)

1130

-1230

Session 2 Swifts Medical

Chair Hans Hockey

Session 2 Boardroom Modelling

Chair Olena Kravchuk1130 Multivariate response

models for global health-related quality of lifeAnnette Kifley Macquarie University

Building a more stable predictive logistic regression modelAnna Campain University of Sydney

1150 Estimation of optimal dynamic treatment regimes from longitudinal observational dataLiliana Orellana Universidad de Buenos Aires

Stepwise paring down variation for identifying influential multifactor interactionsJing-Shiang Hwang Academia Sinica

1210 Parametric conditional frailty models for recurrent cardiovascular events in the lipid studyJisheng Cui Deakin University

Empirical likelihood estimation of a diagnostic test likelihood ratioDavid Matthews University of Waterloo

1230 Lunch (1 hour)
1330 Organised Social Activities

1800 Dinner (own arrangement)

21The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC
0900 Keynote Address (Swifts)

Thomas Lumley, University of Washington
Using the whole cohort in analysis of subsampled data
Chair: Alan Welsh

950

-1030

Session 1 Swifts Clinical Trials

Chair Ian Marschner

Session 1 Boardroom Fisheries

Chair Charis Burridge950 Adjusting for nonresponse in

case-control studiesAlastair Scott University of Auckland

An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimationHideyasu Shimadzu GeoScience Australia

1010 Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associationsVictor Kipnis USA National Cancer Institute

On the 2008 World Fly Fishing ChampionshipsThomas Yee University of Auckland

1030 Morning Tea (30 minutes)

1100

-1220

Session 2 Swifts Medical Models

Chair Katrina Poppe

Session 2 Boardroom AgricultureHorticulture

Chair Emlyn Williams

1100 Relative risk estimation in randomised controlled trials a comparison of methods for independent observationsLisa Yelland University of Adelaide

Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactationRoger Littlejohn AgResearch

1120 Multiple stage procedures in covariate-adjusted response-adaptive designsEunsik Park Chonnam National University

Some statistical approaches in estimating lambing ratesMario DrsquoAntuono Dept of Agriculture WA


1140  Swifts: Potential outcomes and propensity score methods for hospital performance comparisons. Patrick Graham, University of Otago
      Boardroom: FTIR analysis: associations with induction and release of kiwifruit buds from dormancy. Denny Meyer, Swinburne University of Technology
1200  Swifts: Local odds ratio estimation for multiple response contingency tables. Ivy Liu, Victoria University
      Boardroom: Non-linear mixed-effects modelling for a soil temperature study. Pauline Ding, ANU

1220  Lunch (1 hour 10 minutes)
1330  Invited Speaker (Swifts): Embedded partially replicated designs for grain quality testing. Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

1410 - 1510
Session 3 (Swifts): Design. Chair: Ross Darnell
Session 3 (Boardroom): Functional Analysis. Chair: Marcus Davy

1410  Swifts: Spatial models for plant breeding trials. Emlyn Williams, ANU
      Boardroom: Can functional data analysis be used to develop a new measure of global cardiac function? Katrina Poppe, University of Auckland
1430  Swifts: A two-phase design for a high-throughput proteomics experiment. Kevin Chang, University of Auckland
      Boardroom: Variable penalty dynamic warping for aligning GC-MS data. David Clifford, CSIRO
1450  Swifts: Shrinking sea-urchins in a high CO2 world: a two-phase experimental design. Kathy Ruggiero, University of Auckland
      Boardroom: A model for the enzymatically 18O-labeled MALDI-TOF mass spectra. Tomasz Burzykowski, Hasselt University

1510  Afternoon Tea (30 minutes)


1540 - 1700
Session 4 (Swifts): Methods. Chair: David Clifford
Session 4 (Boardroom): Mixtures & Classification. Chair: Thomas Yee

1540  Swifts: High-dimensional multiple hypothesis testing with dependence. Sandy Clarke, University of Melbourne
      Boardroom: On estimation of nonsingular normal mixture densities. Michael Stewart, University of Sydney
1600  Swifts: Metropolis-Hastings algorithms with adaptive proposals. Renate Meyer, University of Auckland
      Boardroom: Estimation of finite mixtures with nonparametric components. Chew-Seng Chee, University of Auckland
1620  Swifts: Bayesian inference for multinomial probabilities with non-unique cell classification and sparse data. Nokuthaba Sibanda, Victoria University
      Boardroom: Classification techniques for class imbalance data. Siva Ganesh, Massey University
1640  Swifts: Filtering in high dimension dynamic systems using copulas. Jonathon Briggs, University of Auckland
      Boardroom: Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the mean. Selvanayagam Ganesalingam, Massey University

1800  Conference Dinner


THURSDAY 3RD DEC

900  Keynote Address (Swifts): Nutrigenomics - a source of new statistical challenges. Chris Triggs, University of Auckland. Chair: Ruth Butler

950 - 1030
Session 1 (Swifts): Genetics. Chair: Ken Dodds
Session 1 (Boardroom): Ecology. Chair: Duncan Hedderley

950   Swifts: Combination of clinical and genetic markers to improve cancer prognosis. Kim-Anh Le Cao, University of Queensland
      Boardroom: A multivariate feast among bandicoots at Heirisson Prong. Teresa Neeman, ANU
1010  Swifts: Effective population size estimation using linkage disequilibrium and diffusion approximation. Jing Liu, University of Auckland
      Boardroom: Environmental impact assessments: a statistical encounter. Dave Saville, Saville Statistical Consulting Ltd

1030  Morning Tea (30 minutes)
1100  Invited Speaker (Swifts): Ordination of marker-trait association profiles from long-term international wheat trials. Kaye Basford, University of Queensland. Chair: Lyn Hunt

1140 - 1220
Session 2 (Swifts): Medical. Chair: Ken Beath
Session 2 (Boardroom): Genetics. Chair: Julian Taylor

1140  Swifts: Finding best linear combination of markers for a medical diagnostic with restricted false positive rate. Yuan-chin Chang, Academia Sinica
      Boardroom: Believing in magic: validation of a novel experimental breeding design. Emma Huang, CSIRO Mathematics, Informatics and Statistics
1200  Swifts: A modified combination test for the analysis of clinical trials. Markus Neuhäuser, Rhein Ahr Campus
      Boardroom: Phenotypes for training and validation of whole genome selection methods. Ken Dodds, AgResearch

1220  Closing Remarks
1230  Lunch
1300  Conference Concludes


ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts): Louise Ryan, CSIRO Mathematics, Informatics and Statistics. Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise Ryan, CSIRO Mathematics, Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. in the insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability, or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.


950 - 1030

MONDAY 30TH NOV. Session 1 (Swifts): Medical. Chair: John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal cancer (CRC) is one of the most common malignant cancers worldwide, and its pattern varies because risk factors have different effects in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluation of the risk factors of the cancer as a whole would not provide a thorough understanding of the disease. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis according to the pathology report of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007 were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis in the Stata statistical software. The results confirm gender, alcohol history, IBD and tumor grade as specific risk factors of colon cancer, and hypertension, opium and personal history as specific risk factors of rectum cancer. BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity: colon and rectum cancers should be evaluated separately to reveal hidden associations which may not be revealed under general modeling. These findings could provide more information for prognosis and treatment therapy, and for the possible application of screening programs specifically for colon and rectum carcinomas.


PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1 CSIRO Australia, Mathematics, Informatics and Statistics, Glen Osmond, South Australia
2 Department of Surgery, University of Adelaide, The Queen Elizabeth Hospital, Adelaide, South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001, and whose outcomes were followed for more than five years.

The ERA Model is available at the following website (wwwhealthadelaideeduausurgeryevar). The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA Model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using specialist UK vascular institute data. Despite UK patients being sicker (p<0.001), having larger aneurysms (p<0.001) and being more likely to die (p<0.05) than the Australian patients, the ERA model fitted UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher areas under ROC curves and/or higher R2.

The ERA Model appears to be robust Further external validation and improvements to the model will occur within a recently approved NHMRC grant

1. Barnes (2008) Eur J Vasc Endovasc Surg 35: 571-579.


950 - 1030

MONDAY 30TH NOV. Session 1 (Boardroom): Ecological Modelling. Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1 CSIRO Mathematics, Informatics and Statistics
2 CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that annual multi-species fishery-independent surveys be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (httpwwwstatuni-muenchende~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region as well as the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1 CSIRO Mathematics, Informatics and Statistics, Australia
2 School of Mathematics and Statistics, Northeast Normal University, China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust, and has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions. To alleviate the computational burden we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.
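As a rough illustration of rank-based regression in R (not the induced-smoothing estimator described above, and with simulated rather than real environmental data), the CRAN package Rfit fits a Wilcoxon-score rank regression:

    # install.packages("Rfit")   # rank-based regression (Kloke & McKean)
    library(Rfit)

    set.seed(1)
    x <- runif(100, 0, 10)
    y <- 2 + 0.5 * x + rt(100, df = 2)   # heavy-tailed (t with 2 df) errors

    fit_ls   <- lm(y ~ x)     # least squares, for comparison
    fit_rank <- rfit(y ~ x)   # rank (Wilcoxon score) regression
    summary(fit_rank)         # estimates and standard errors

With heavy-tailed errors the rank-based slope estimate is typically less affected by extreme observations than the least-squares one.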


1100 - 1220

MONDAY 30TH NOV. Session 2 (Swifts): Modelling. Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird, VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize the sum of e(Q - I(e < 0)) over observations, where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
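A minimal sketch of this in R, using the quantreg package and its bundled Engel food-expenditure data (an illustrative choice, not an example from the talk):

    library(quantreg)

    data(engel)                       # household income and food expenditure
    taus <- c(0.1, 0.5, 0.9)          # three quantiles, including the median
    fit <- rq(foodexp ~ income, tau = taus, data = engel)
    summary(fit)

    plot(foodexp ~ income, data = engel)
    for (i in seq_along(taus))
      abline(coef = coef(fit)[, i], lty = i)   # one fitted line per quantile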


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1 School of MAPS, University of Newcastle
2 Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies representing 188 patients which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.
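For readers without WinBUGS, a classical random-effects analogue of this kind of pooling can be sketched in R with the metafor package; the study estimates below are invented for illustration and are not the Honoki et al. data or the authors' Bayesian model:

    library(metafor)

    # hypothetical per-study log risk ratios and their standard errors
    yi  <- log(c(2.4, 1.8, 2.9, 1.5, 2.2, 2.0))
    sei <- c(0.35, 0.40, 0.55, 0.30, 0.45, 0.50)

    res <- rma(yi = yi, sei = sei, method = "REML")   # random-effects pooling
    predict(res, transf = exp)                        # pooled risk ratio and interval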


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie, Department of Statistics and School of Geography, Geology and Environment, University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand resulting in an abundance of material spanning thousands of years Therefore kauri has a strong potential as a source for inferring past climates

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years However there are concerns that the observed patterns are a result of a possible failure of the uniformitarianism principle This is because the responses of kauri to the common climate signal in a particular year may be influenced by the size of the tree and hence that this change in response could affect the observed trends in reconstructed ENSO activity Therefore the dataset containing time series of ring width indices for each core was divided into two subsets

1 The portion of the series produced when the trees were small and

2 The portion of the series produced when the trees were large

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different and to allow specific time periods of differencesimilarity to be identified


HOW SAS AND R INTEGRATE

Michael Graham, Analytics, SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers some of them a way to experiment with new cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together and plans for future integration

1100 - 1220

MONDAY 30TH NOV. Session 2 (Boardroom): Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1 Victoria University of Wellington, NZ
2 Waseda University, Japan
3 Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1 Statistical Consulting Unit, ANU
2 Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals but as yet has not been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaConTM (Vac1), or a single vaccination of GonaConTM followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions at irregular intervals over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or boosted vaccination.

The data are analysed using repeated measures methods to assess the long term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1 CSIRO Mathematics, Informatics and Statistics
2 CSIRO Wealth from Ocean Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.
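A toy version of the idea, grouping sites or species with similar presence-absence responses by fitting a finite mixture of binomial GLMs, could look like the following in R; the flexmix package and the simulated data are my assumptions, not necessarily the authors' implementation:

    library(flexmix)

    set.seed(42)
    n    <- 500
    temp <- runif(n, 12, 30)                      # an environmental gradient
    grp  <- rbinom(n, 1, 0.5)                     # two "archetypes" of response
    p    <- plogis(ifelse(grp == 1, -6 + 0.3 * temp, 4 - 0.25 * temp))
    dat  <- data.frame(temp = temp, pres = rbinom(n, 1, p))

    fit <- flexmix(cbind(pres, 1 - pres) ~ temp, data = dat, k = 2,
                   model = FLXMRglm(family = "binomial"))
    parameters(fit)    # archetype-specific intercepts and slopes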


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S. S. Clark, University of Otago

E-mail aclarkmathsotagoacnz

When the chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser S & Greenhouse S W (1958, JEBS, 69-82) and Huynh H & Feldt L S (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient and these symptoms were not totally independent.

MONDAY 30TH NOV, 1330. Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1 University of Auckland, NZ
2 University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.
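As a small, generic illustration of the kind of efficiency issue involved (my example, not one taken from the talk), growing a result inside a loop is far slower than preallocating or vectorising:

    n <- 1e5
    x <- runif(n)

    slow <- function(x) {                 # grows the result one element at a time
      out <- numeric(0)
      for (i in seq_along(x)) out <- c(out, sqrt(x[i]))
      out
    }
    fast <- function(x) sqrt(x)           # vectorised: a single call

    system.time(s1 <- slow(x))
    system.time(s2 <- fast(x))
    identical(s1, s2)                     # same answer, very different cost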


1410 - 1510

MONDAY 30TH NOV. Session 3 (Swifts): Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster, Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator However estimating the reduced variance is well-known to be a difficult problem If variance is estimated without taking account of the systematic design the gain in reducing the variance can be lost by overestimating it The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design approximate it by a stratified design or to model the correlation between population units Approximation by a random design can perform very poorly while approximation by a stratified design is an improvement but can still be severely biased in some situations I will describe a new estimator based on modeling the correlation in encounters between samplers that are close in space The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision The estimator is applied to surveys of spotted hyena in the Serengeti National Park Tanzania for which it produces a dramatic change in the reported coefficient of variation


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1 Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2 NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods was illustrated in examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.
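A hedged sketch of how such variance components might be extracted with REML in R (simulated balanced data and the lme4 package; illustrative only, not the authors' analysis):

    library(lme4)

    set.seed(1)
    # 10 subjects x 2 assessors x 3 sessions, one gait summary value per cell
    d <- expand.grid(subject  = factor(1:10),
                     assessor = factor(1:2),
                     session  = factor(1:3))
    d$speed <- 1.2 + rnorm(10, 0, 0.15)[as.integer(d$subject)] +
               rnorm(nrow(d), 0, 0.08)

    # between-subject and assessor-within-subject components plus residual error;
    # the assessor component is zero in this simulation, so it may be estimated as ~0
    fit <- lmer(speed ~ 1 + (1 | subject) + (1 | subject:assessor), data = d)
    VarCorr(fit)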


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlex.

Using fungus microsatellite data we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages in which ANOVA and REML are standard methods may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
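A sketch of the ANOVA/REML formulation in R, on simulated data; lme4 is my choice of general-purpose package here, not necessarily the one used in the talk:

    library(lme4)

    set.seed(2)
    # individuals nested in populations nested in regions
    d <- expand.grid(ind = 1:8, population = factor(1:5), region = factor(1:3))
    d$score <- rnorm(3, 0, 1.0)[as.integer(d$region)] +
               rnorm(15, 0, 0.7)[as.integer(d$region:d$population)] +
               rnorm(nrow(d), 0, 0.5)

    fit <- lmer(score ~ 1 + (1 | region) + (1 | region:population), data = d)
    VarCorr(fit)   # among regions, among populations within regions, within populations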


1410 - 1510

MONDAY 30TH NOV. Session 3 (Boardroom): Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1 Plant and Food Research
2 University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla12

1 CMIS, CSIRO
2 Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help to understand the influence of underlying genetics on traits of interest In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments Due to the nature of these experiments extra components of variation such as spatial trends and extraneous environmental variation needs to be accommodated and can be achieved using linear mixed models However with these models the inclusion of an additional high dimensional genetic component becomes problematic This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework The methodology shows that this incorporation can be achieved in a natural way even when the number of genetic variables exceeds the number of observations This method is then applied to wheat quality traits and a well established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO This example focusses on the simultaneous incorporation and selection of up to 75000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest The results show possibly for the first time that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1 AgResearch
2 Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, and associated phenotypic data (milk yield; protein, casein and total solids percentage and yield; and growth hormone, IGF and insulin levels). This data is highly valuable as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data was analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data was used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities re the handling of such large datasets are also described including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination


1540 - 1700

MONDAY 30TH NOV. Session 4 (Swifts): Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted The conclusion of the study is then based on comparing the data to that pre-stated margin Different opinions exist on how such margins should be determined Some are highly statistical some are based much more on clinical judgement some are based on absolute differences between treatments some on relative differences There is little consensus across the medical scientific and clinical trials communities on how small such margins should be or even on what the principles for setting margins should be

In a superiority study although we may carry out a significance test of the null hypothesis of zero difference we base decisions about using a treatment on far wider criteria As a minimum we consider the size of the benefit and the size of any adverse effects Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin, to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan, Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure several physical tests were performed on food samples For each sample these texture analysis and rheological methods generated many data points which were plotted as curves Summarising these curves usually involves finding points of interest such as peaks or troughs and points of maximum slope which is often done subjectively by eye alone I describe an Excel-based system using Excel macros and R (via RExcel) that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1 IFS, Massey University, NZ
2 Epicentre, Massey University, NZ
3 Department of Statistics, UC Irvine, USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows It is considered to be a leading cause of infectious lameness BDD lesions tend to be highly painful hence BDD has been identified as a major welfare concern Although economic impacts are difficult to quantify these are thought to be substantial

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test applied to blood samples from individual animals is more convenient to carry out than foot inspection, and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire UK to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test Earlier studies had suggested a seasonal pattern as well as dependency on the age and foot hygiene of the individual cow Interestingly our results show seasonality in lesion status but not in the serology score

Here we treat lesion status and serology score as imperfect tests of infection status which we model as an autocorrelated latent binary process Covariate effects can enter in various ways into this model We adopt a parsimonious approach using MCMC for parameter estimation and for predicting the infection history of individual cows


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1 University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2 Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurements of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth differs between populations. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length
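A rough sketch of this centile construction in R, using simulated measurements and assuming normally distributed residuals around the quadratic mean curve (illustrative only, not the study's data or exact method):

    set.seed(3)
    ga  <- runif(400, 20, 42)                                # gestational age, weeks
    bpd <- 15 + 4.2 * ga - 0.045 * ga^2 + rnorm(400, 0, 3)   # e.g. biparietal diameter, mm

    fit  <- lm(bpd ~ ga + I(ga^2))          # quadratic in gestational age
    grid <- data.frame(ga = 20:42)
    mu   <- predict(fit, grid)
    sdr  <- summary(fit)$sigma

    centiles <- data.frame(ga  = grid$ga,
                           p10 = mu + qnorm(0.10) * sdr,
                           p50 = mu,
                           p90 = mu + qnorm(0.90) * sdr)
    head(centiles)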


1540 - 1700

MONDAY 30TH NOV. Session 4 (Boardroom): Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1 School of Mathematics and Statistics, University of Sydney, Australia
2 School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled in terms of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess stability of chosen models. The presented methodology is not limited to likelihood-based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains, such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.
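One simple way such paired expert judgements could be turned into weights is to exploit the equivalence between the Bradley-Terry model and a logistic regression on pairwise contrasts; the indicator names and judgements below are invented, and the authors' actual formulation (ratio-valued importance judgements rather than binary preferences) may well differ:

    ind <- c("water", "soil", "biota")
    # each row: one expert judgement; win1 = 1 if indicator i was preferred to j
    comp <- data.frame(win1 = c(1, 1, 0, 1, 0, 1, 0),
                       i = c("water", "water", "water", "water", "water", "soil", "soil"),
                       j = c("soil",  "soil",  "soil",  "biota", "biota", "biota", "biota"))

    # design matrix of +1/-1 contrasts between the two indicators in each comparison
    X <- t(sapply(seq_len(nrow(comp)), function(r) {
      v <- setNames(numeric(length(ind)), ind)
      v[comp$i[r]] <- 1
      v[comp$j[r]] <- -1
      v
    }))

    fit   <- glm(comp$win1 ~ X[, -1] - 1, family = binomial)  # "water" as baseline
    worth <- exp(c(water = 0, coef(fit)))                      # Bradley-Terry worths
    worth / sum(worth)                                         # normalised weights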


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1 CSIRO Australia, Mathematics, Informatics and Statistics
2 Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, handling the dynamic nature of the system, and taking into account various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1 University of the Philippines Visayas
2 University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second-order model, central composite design
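For orientation, a generic backfitting loop for an additive model looks like the sketch below (illustrative only, not the authors' response-surface estimator): each component is repeatedly re-fitted to the partial residuals left by the others until the fitted functions stabilise.

    set.seed(4)
    x1 <- runif(200); x2 <- runif(200)
    y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(200, 0, 0.2)

    f1 <- rep(0, 200); f2 <- rep(0, 200)
    for (iter in 1:20) {                                # iterate until stable
      f1 <- fitted(loess((y - mean(y) - f2) ~ x1))      # update component for x1
      f2 <- fitted(loess((y - mean(y) - f1) ~ x2))      # update component for x2
    }
    yhat <- mean(y) + f1 + f2                           # additive fit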


TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J. Martin Bland, Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established Clustering also happens in trials where participants are allocated individually when the intervention is provided by individual operators such as surgeons or therapists These operators form a hidden sample whose effect is usually ignored Recently trial designers have been considering how they should allow for this clustering effect and funders have been asking applicants the same question In this talk I examine some of these issues and suggest one simple method of analysis


950 - 1030

TUESDAY 1ST DEC. Session 1 (Swifts): Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs, I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time while others were more recently developed in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.


APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath, Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data This ignores the possibility of heterogeneity within classes which will result in identification of extraneous classes An improved method is to incorporate random effects into the latent class model This is applied to data on bladder control in children demonstrating the over extraction of classes using standard latent class methods A difficulty with this method is the assumption of normality of the random effect This may be improved by assuming that each class is a mixture

950 - 1030

TUESDAY 1ST DEC. Session 1 (Boardroom): Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh, The Australian National University

E-mail AlanWelshanueduau

We will discuss my most recent thoughts on how to approach modelling count data which may contain extra-zeros We will work through an example from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra-zeros and possible overdispersion We will illustrate the advantages of separating the effects of overdispersion and extra-zeros and show how to use diagnostics to deal successfully with these issues
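As a loose illustration of the two ingredients mentioned above (overdispersion and extra zeros), and not of the speaker's specific strategy or diagnostics, one might compare a quasi-Poisson fit with a zero-inflated fit from the pscl package on simulated data:

    library(pscl)

    set.seed(5)
    x    <- runif(300)
    mu   <- exp(0.5 + 1.2 * x)
    zero <- rbinom(300, 1, 0.3)                # 30% structural (extra) zeros
    y    <- ifelse(zero == 1, 0, rpois(300, mu))

    fit_pois  <- glm(y ~ x, family = poisson)
    fit_quasi <- glm(y ~ x, family = quasipoisson)     # dispersion > 1 signals trouble
    fit_zip   <- zeroinfl(y ~ x | 1, dist = "poisson") # separate extra-zero component
    summary(fit_zip)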


A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner, Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively rather than multiplicatively to a collection of predictor variables Such models have a range of applications but are particularly important in epidemiology where they can be used to model absolute differences in disease incidence rates as a function of covariates A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable due to the non-negativity constraints on the Poisson means I will present a straightforward and flexible method based on the EM algorithm which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients to be non-negative rather than the fitted means I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space after which the global constrained maximum is identified from among the subset maxima Both categorical factors and continuous covariates can be accommodated the latter having either a linear form or a completely unspecified isotonic form The method is particularly useful with resampling methods such as the bootstrap which may require reliable convergence for thousands of implementations The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts
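For orientation, the standard (non-EM) way to attempt such a fit in R is shown below on simulated data; as noted in the abstract, this iteratively reweighted least squares approach needs valid starting values and can fail by stepping outside the constrained parameter space, which is what the proposed EM method is designed to avoid:

    set.seed(6)
    x <- runif(200, 0, 2)
    y <- rpois(200, 1 + 2 * x)          # additive (identity-link) Poisson mean

    fit <- glm(y ~ x, family = poisson(link = "identity"),
               start = c(1, 1))         # starting values giving positive fitted means
    coef(fit)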


1130 - 1230

TUESDAY 1ST DEC. Session 2 (Swifts): Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1 Macquarie University, Australia
2 NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects These assessments usually involve multiple QOL questionnaires each containing a mix of items about diverse specific and global aspects of QOL Quality of life itself is regarded as an unobserved underlying construct

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies Common approaches include selecting from or averaging the one or two direct global item measures obtained or calculating a summary score from the subdimensional item measures of a QOL questionnaire An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL The first two approaches do not take advantage of all the information collected while the third assumes that questions of interest fall into a relatively small number of theoretical domains which may not always be the case

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework using data from two clinical studies in cancer patients This methodology utilises all the available data accommodates the common problem of missing item responses obviates the need for precalculated or selected summary scores and can capture underlying correlations and dimensions in the data

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures Models that delineate QOL scales will be compared with those that delineate QOL domains and the contribution of different variance components will be assessed Since the data comprises a mix of non-normal continuous response measures and ordinal response measures distributional issues will also be considered


ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1 Andrea Rotnitzky23 and James Robins3

1 Instituto de Calculo, Universidad de Buenos Aires, Argentina
2 Universidad T. di Tella, Buenos Aires, Argentina
3 Harvard School of Public Health, Boston, USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b_0. Under these models the optimal treatment g_x^opt is a function of b_0, so efficient estimation of the optimal regime depends on the efficient estimation of b_0. We derive a class of consistent and asymptotically normal estimators of b_0 under the proposed models and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV positive patients to illustrate estimation of the optimal CD4 count level to start HAART


PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction (MI) events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models, the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that the cholesterol-lowering drug pravastatin (the intervention being tested in the trial) had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk compared with those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.


11:30 - 12:30

TUESDAY 1ST DEC, Session 2, Boardroom: Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities, and is in that sense superior to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing similar parameter estimates after imputation to those found when the data were fully observed. It will be shown that the amount of missingness present in the data set, and the nature of the variable in question, affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.
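
As a rough illustration of this kind of workflow (not the authors' actual procedure), multiple imputation followed by a pooled logistic regression fit can be sketched in R with the mice package; the data frame preg and its variables are hypothetical stand-ins for the early pregnancy study data.

# Illustrative sketch only: multiple imputation + pooled logistic regression.
# 'preg' is a hypothetical data frame with binary outcome 'viable' and
# covariates containing missing values.
library(mice)

imp  <- mice(preg, m = 20, seed = 1)          # 20 imputed data sets
fits <- with(imp, glm(viable ~ age + hcg + progesterone, family = binomial))
pooled <- pool(fits)                          # combine estimates by Rubin's rules
summary(pooled)

# Compare the pooled estimates with the complete-case (case deletion) fit
cc <- glm(viable ~ age + hcg + progesterone, family = binomial,
          data = na.omit(preg))
summary(cc)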


STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists use techniques that produce high dimensional data to unveil hidden information. Although several model based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple, as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.
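
A minimal sketch of the core idea as described, i.e. repeatedly fitting single-term analyses of variance and paring the best factor's contribution out of the response, is given below; it is not the authors' R package, and the variable names and stopping rule are assumptions.

# Illustrative sketch of stepwise paring-down: at each step fit every factor on
# its own, keep the one explaining most variation, subtract its fitted values
# from the response, and repeat. 'y' is a numeric response and 'X' a data frame
# of factor variables (both hypothetical); the p < alpha stopping rule is assumed.
pare_down <- function(y, X, alpha = 0.01, max_steps = 10) {
  selected <- character(0)
  r <- y
  for (step in seq_len(max_steps)) {
    pvals <- sapply(X, function(x) anova(lm(r ~ factor(x)))[1, "Pr(>F)"])
    if (min(pvals) > alpha) break
    best <- names(which.min(pvals))
    selected <- c(selected, best)
    r <- residuals(lm(r ~ factor(X[[best]])))  # pare down the explained variation
  }
  selected
}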


EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e., the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1/p2 and r- = (1 - p1)/(1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio r(x) = f1(x)/f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating r(x) and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.


WEDNESDAY 2ND DEC

9:00 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample, for a "validation study", or a sample stratified on a health outcome, for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.
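
One concrete route to using whole-cohort information of the kind described is the two-phase design machinery in the R survey package; the sketch below, with a hypothetical cohort data frame and subsample indicator, illustrates the general idea rather than the analysis presented in the talk.

# Illustrative sketch: an outcome-stratified subsample analysed as a two-phase
# design, with calibration to whole-cohort variables known for everyone.
# 'cohort' is hypothetical; 'insample' flags the phase-two subset.
library(survey)

des <- twophase(id = list(~1, ~1),
                strata = list(NULL, ~case),   # subsampling stratified on outcome
                subset = ~insample, data = cohort)

# Calibrate phase-two weights to whole-cohort totals of auxiliary variables
des_cal <- calibrate(des, phase = 2, formula = ~ age + sex)

fit <- svyglm(case ~ exposure + age + sex, design = des_cal,
              family = quasibinomial())
summary(fit)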


9:50 - 10:30

WEDNESDAY 2ND DEC, Session 1, Swifts: Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al. (2002, Biometrical J 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient, methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al. 1997, Ann Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al. for their simulations.
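
A minimal sketch of the basic survey-weighting idea (inverse selection-probability weights in a weighted logistic regression) is given below with hypothetical variable names; it illustrates only the simplest member of the class of estimating equations discussed here.

# Illustrative sketch: weight each responding subject by the inverse of its
# estimated selection/response probability. 'cc' is a hypothetical case-control
# data frame with a response indicator 'responded'.
resp_mod <- glm(responded ~ age + smoking, family = binomial, data = cc)
cc$w     <- 1 / fitted(resp_mod)              # inverse-probability weights

fit <- glm(case ~ exposure + age, family = quasibinomial,
           weights = w, data = subset(cc, responded == 1))
summary(fit)  # sandwich (robust) standard errors would be used in practice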


CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute
2Texas A and M University
3Gertner Institute for Epidemiology and Policy Research, Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.


9:50 - 10:30

WEDNESDAY 2ND DEC, Session 1, Boardroom: Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary tasks to understand biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model reflecting the sampling process commonly used in marine surveys is proposed, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling is quite influential on presence/absence measures of species, and is no longer ignorable.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.


11:00 - 12:20

WEDNESDAY 2ND DEC, Session 2, Swifts: Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
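
For reference, two of the estimators being compared, log binomial regression and log Poisson regression with robust (sandwich) variance, can be sketched in R as follows; the trial data frame and variable names are hypothetical.

# Illustrative sketch of two relative-risk estimators for an RCT.
# 'trial' is a hypothetical data frame with binary 'event', treatment 'treat'
# and a baseline covariate 'x'.
library(sandwich)
library(lmtest)

# Log binomial regression: exp(coef) is an adjusted relative risk,
# but the fit may fail to converge.
fit_lb <- glm(event ~ treat + x, family = binomial(link = "log"), data = trial)

# Log Poisson regression with robust (sandwich) variance (Zou 2004).
fit_lp <- glm(event ~ treat + x, family = poisson(link = "log"), data = trial)
coeftest(fit_lp, vcov = vcovHC(fit_lp, type = "HC0"))  # robust Wald tests

exp(coef(fit_lp)["treat"])  # estimated relative risk for treatment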


MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail espark02gmailcom

The idea of a response-adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial, without unduly diminishing its statistical significance and efficiency. In addition, innovation in genomics-related biomedical research has made personalised medicine possible, which also makes adjustment for the covariates of subjects who join the trial an important issue in a clinical trial.

Adaptive design is a longstanding statistical method for situations where the design of a statistical model involves unknown parameters that must be estimated during the course of an experiment; thus the concept of sequential analysis is naturally involved. The large-sample properties of estimation under such a problem have been studied and can be found in the literature, for example Zhang et al. (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires that both the estimation and design procedures be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure, the multiple-stage method, which requires the estimation and design to be updated at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it retains the advantages of the fully sequential method to some degree while being more convenient in practical operation. Here we study the three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesised data is also presented.

Traditionally, a response-adaptive (RA) design is used under the assumption that there is no treatment-covariate interaction effect, i.e. the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, an RA design will then make incorrect treatment allocations: it can be correct in one part of the population but completely wrong in the other. Thus, in this case the CARA design should perform better than the RA design.


In this work we also compare sequential analysis in response-adaptive designs with and without covariate adjustment, and a numerical study of synthesised data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail patrickgrahamotagoacnz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple-category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al. (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes, and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30 day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.
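
A stripped-down sketch of the stratification-and-standardisation step (without the joint outcome modelling or the hierarchical Bayesian smoothing described above) might look as follows in R; the data frame, variables and quintile strata are assumptions made for illustration only.

# Illustrative sketch: multinomial propensity scores for hospital membership,
# quintile stratification, and a crude standardised risk per hospital.
# 'dat' is hypothetical, with factor 'hospital', 0/1 outcome 'died' and
# case-mix variables 'age', 'sex', 'comorbidity'.
library(nnet)

ps <- predict(multinom(hospital ~ age + sex + comorbidity, data = dat),
              type = "probs")                     # one score column per hospital

std_risk <- sapply(levels(dat$hospital), function(h) {
  q <- cut(ps[, h], quantile(ps[, h], 0:5 / 5), include.lowest = TRUE)
  in_h   <- dat$hospital == h
  risk_s <- tapply(dat$died[in_h], q[in_h], mean)  # stratum-specific risks
  weighted.mean(risk_s, table(q) / length(q), na.rm = TRUE)  # standardise to cohort
})
std_risk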


LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail iliumsorvuwacnz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables, controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, in many surveys respondents may select more than one outcome category, so observations can fall in more than one cell of the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.
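
For orientation, the sample local odds ratios of a two-way table are simple to compute; the base-R sketch below shows the quantity being estimated, not the new maximum likelihood or Mantel-Haenszel estimators of the talk.

# Illustrative sketch: sample local odds ratios for an I x J contingency table,
# theta[i, j] = n[i, j] * n[i+1, j+1] / (n[i, j+1] * n[i+1, j]).
local_or <- function(n) {
  I <- nrow(n); J <- ncol(n)
  n[1:(I - 1), 1:(J - 1)] * n[2:I, 2:J] /
    (n[1:(I - 1), 2:J] * n[2:I, 1:(J - 1)])
}

tab <- matrix(c(20, 15,  5,
                10, 20, 10,
                 5, 10, 25), nrow = 3, byrow = TRUE)  # toy counts
local_or(tab)   # (I-1) x (J-1) matrix of local odds ratios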


11:00 - 12:20

WEDNESDAY 2ND DEC, Session 2, Boardroom: Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail rogerlittlejohnagresearchconz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail mdantuonoagricwagovau

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the 'seeming' lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1 Murray Judd2 John Meekings3 Annette Richardson3 and Eric Walton4

1Swinburne University of Technology
2Seeka Kiwifruit Industries
3The New Zealand Institute for Plant and Food Research Ltd
4University of Otago

E-mail dmeyerswineduau

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail Paulinedinganueduau

There is a growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were Ground Cover Type (covered, uncovered), Distance from the log (0 cm, 10 cm, 20 cm, 40 cm, 80 cm) and Depth (1 cm, 5 cm). Two non-linear mixed models were used to study the different treatment effects.

13:30 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1
1Wagga Wagga Agricultural Institute, Australia

2Rothamsted Research Harpenden UK

E-mail alisonsmithindustrynswgovau

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples, which precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.


14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3, Swifts: Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second-phase, laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
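
A rough indication of what that closing analysis step could look like, assuming a normalised two-colour data object MA and a design matrix from the chosen Phase 2 layout (both hypothetical here), is:

# Illustrative sketch of a limma analysis for a two-colour microarray design.
# 'MA' (normalised log-ratios) and 'design' (with a column named "acidity")
# are assumed inputs, not objects from the experiment described.
library(limma)

fit <- lmFit(MA, design)                        # per-gene linear model
fit <- eBayes(fit)                              # moderated t-statistics
topTable(fit, coef = "acidity", number = 20)    # top genes for the acidity effect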

14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3, Boardroom: Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1 Gillian Whalley1 Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately from relaxation.

Three-dimensional ultrasound images of the heart allow the volume of the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against its first and second derivatives traces out a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation, and so develop a new measure of global cardiac function.
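
A base-R sketch of the idea, smoothing a volume trace, taking derivatives and measuring the area of one two-dimensional projection of the loop, is given below; the volume curve is simulated and the shoelace-area step is an illustrative assumption, not the authors' implementation.

# Illustrative sketch: smooth LV volume over one cardiac cycle, take first and
# second derivatives, and compute the area enclosed by the (V, dV/dt) projection
# of the loop via the shoelace formula. The volume trace is simulated.
t   <- seq(0, 1, length.out = 200)               # one cycle, arbitrary time units
vol <- 120 - 50 * sin(pi * t)^2                  # toy volume curve (ml)

f   <- splinefun(t, vol)                         # interpolating spline
v   <- f(t); dv <- f(t, deriv = 1); d2v <- f(t, deriv = 2)  # full loop is (v, dv, d2v)

loop_area <- function(x, y)                      # shoelace formula for a closed curve
  abs(sum(x * c(y[-1], y[1]) - c(x[-1], x[1]) * y)) / 2

loop_area(v, dv)                                 # area of the volume-velocity projection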


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology separates the test substance into, and quantifies the amount of, each compound that makes it up. Typically the first step in an analysis of data like this is alignment, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances - e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al. (2009) Anal Chem 81(3), 1000-1007.
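
A stripped-down dynamic programming sketch of the underlying idea, adding a penalty whenever a non-diagonal (warping) step is taken, is shown below; it uses a constant rather than a variable penalty and is not the authors' implementation (see the reference above).

# Illustrative sketch: DTW alignment cost with a penalty on non-diagonal steps.
dtw_penalised <- function(x, y, penalty = 0.1) {
  n <- length(x); m <- length(y)
  D <- matrix(Inf, n + 1, m + 1); D[1, 1] <- 0
  for (i in 1:n) {
    for (j in 1:m) {
      cost <- abs(x[i] - y[j])
      D[i + 1, j + 1] <- min(D[i, j] + cost,                # diagonal (no warp)
                             D[i, j + 1] + cost + penalty,  # warping step
                             D[i + 1, j] + cost + penalty)  # warping step
    }
  }
  D[n + 1, m + 1]                                           # total alignment cost
}

# Toy example: two slightly shifted signals
dtw_penalised(sin(seq(0, 6, 0.1)), sin(seq(0.3, 6.3, 0.1)), penalty = 0.2)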


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1 Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, e.g., two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, allows the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may incorporate various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al. (submitted) developed a Markov model which allows one to adjust the analysis of 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4, Swifts: Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1
1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians grows from a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although this is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances when this is not the case, which will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the FWER or FDR should enable us to improve the power of our procedures to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand
2University of South Carolina, USA
3University of Montreal, Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution, we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities. Using various examples, we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1
1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) are used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1
1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with a general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4, Boardroom: Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1
1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show that the new approach performs better through simulation studies and some real-world biological data sets.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally sized, or balanced, and the classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class or group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). The traditional classification techniques result in bad performance when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being on building models to correctly classify the minority class.

In this presentation, a brief overview of the approaches found in the literature is given, followed by details of some alternatives proposed. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples, and the findings are discussed.
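
For illustration, the two sampling schemes mentioned can be sketched in a few lines of R; the data frame d, its binary class y and the logistic model are hypothetical.

# Illustrative sketch: under-sample the majority class or over-sample the
# minority class before fitting a classifier. 'd' has a 0/1 class column 'y'.
set.seed(1)
minority <- d[d$y == 1, ]; majority <- d[d$y == 0, ]

# Under-sampling: keep all minority cases, a random subset of the majority
under <- rbind(minority, majority[sample(nrow(majority), nrow(minority)), ])

# Over-sampling: replicate minority cases (with replacement) up to majority size
over  <- rbind(majority, minority[sample(nrow(minority), nrow(majority),
                                         replace = TRUE), ])

fit_under <- glm(y ~ ., family = binomial, data = under)
fit_over  <- glm(y ~ ., family = binomial, data = over)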


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of error rates is of vital importance in classification problems, as this is used as a basis for choosing the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally, a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and QDF are derived and computed for various covariance structures in a simulation exercise, which serve as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. It also provides a closed form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data


THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS - A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1
1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory Bowel Diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9:50 - 10:30

THURSDAY 3RD DEC, Session 1, Swifts: Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao12 Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail klecaouqeduau

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1
1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift is used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation, and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.

9:50 - 10:30

THURSDAY 3RD DEC, Session 1, Boardroom: Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS: A STATISTICAL ENCOUNTER

Dave Saville1
1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park, with nearby housing developments experiencing the noise impact of trumpeting.


11:00 THURSDAY 3RD DEC, Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1 IH Delacy12 J Crossa3 PM Kroonenberg4 MJ Dieters1 and KE Basford12

1The University of Queensland, Australia
2Australian Centre for Plant Functional Genomics, Australia
3CIMMYT, Mexico
4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al. 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address the non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM block and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


1140 - 1220 THURSDAY 3RD DEC
Session 2 (Swifts): Medical
Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1

1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC, respectively). In some medical diagnostics the false positive rate must be confined within a specific range, which makes the pAUC a reasonable choice in such circumstances. We therefore emphasize pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

lp = (wD SD + wD̄ SD̄)^-1 (mD - mD̄)

where mD, SD and mD̄, SD̄ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients wD, wD̄ ∈ R1 depend on the given specificity and are also functions of lp. Thus the solution of lp requires an iterative procedure. We apply it to the data set of Liu et al (2005, Stat in Med) and the numerical results show that our method outperforms that of Liu et al (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large-sample properties of this method are derived. We then apply it to some real data sets, and the results are very promising, locating markers that are never found via AUC-based methods.
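
A minimal Python sketch of the empirical pAUC criterion that such algorithms optimise (simulated markers and an arbitrary candidate weight vector; none of this is the authors' code):

import numpy as np

def partial_auc(y, score, fpr_max=0.2):
    # Empirical partial AUC of `score` for binary labels y (1 = diseased),
    # restricted to false positive rates at or below fpr_max
    order = np.argsort(-score)
    y = np.asarray(y)[order]
    tpr = np.cumsum(y) / y.sum()
    fpr = np.cumsum(1 - y) / (len(y) - y.sum())
    keep = fpr <= fpr_max
    x = np.r_[0.0, fpr[keep]]
    t = np.r_[0.0, tpr[keep]]
    return np.sum(np.diff(x) * (t[1:] + t[:-1]) / 2)   # trapezoidal area

rng = np.random.default_rng(0)
n = 200
y = np.r_[np.ones(n), np.zeros(n)]
markers = np.r_[rng.normal(0.8, 1, (n, 2)), rng.normal(0.0, 1, (n, 2))]
l = np.array([0.7, 0.3])             # candidate combination vector (illustrative)
print(partial_auc(y, markers @ l))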


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1

1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, ie for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore an asymmetric decision rule as proposed by Bauer & Köhne (1994) for adaptive designs is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994). For example, if α = 0.05 and α0 = 0.5 then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
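
A minimal Python sketch of the decision rule stated above; a1 and a0 follow the example in the abstract (α = 0.05), while the product-criterion constant c_alpha is an assumed placeholder, not the value derived in the talk.

def modified_combination_test(p1, p2, a1=0.1793, a0=0.5, c_alpha=0.0087):
    # Reject if max(p1, p2) <= a1, or if max(p1, p2) <= a0 and p1*p2 <= c_alpha.
    # a1, a0 as in the abstract's example; c_alpha is an assumed placeholder.
    p_max = max(p1, p2)
    return p_max <= a1 or (p_max <= a0 and p1 * p2 <= c_alpha)

# Rejected via the product criterion, since max(p1, p2) exceeds a1
print(modified_combination_test(0.02, 0.30))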


1140 - 1220 THURSDAY 3RD DEC
Session 2 (Boardroom): Genetics
Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2

1CSIRO Mathematics, Informatics and Statistics; 2CSIRO Food Futures National Research Flagship

3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represents phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, because there are multiple founders and the intermediate generations are unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1 Benoit Auvray1 Peter Amer2 Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch, Invermay Agricultural Centre, Mosgiel, New Zealand; 2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will be almost true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets, and appropriate phenotypes, for datasets such as those in the sheep industry.
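
The age-based split can be sketched as follows in Python, with ridge regression standing in for a genomic prediction equation; the data, dimensions and threshold year are all made up for illustration.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n, p = 500, 2000
geno = rng.integers(0, 3, size=(n, p)).astype(float)        # SNP genotypes (0/1/2)
phen = geno @ rng.normal(0, 0.05, p) + rng.normal(0, 1, n)   # phenotypes
birth_year = rng.integers(2000, 2008, n)

train = birth_year <= 2005        # older animals: training set
valid = ~train                    # younger animals: validation set

model = Ridge(alpha=100.0).fit(geno[train], phen[train])
accuracy = np.corrcoef(phen[valid], model.predict(geno[valid]))[0, 1]
print(f"validation accuracy (correlation): {accuracy:.2f}")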


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler1

1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has the interpretations that are assumed, but often not valid, for a classical inference. For example, p values are often interpreted in a classical analysis as giving 1 - p as the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews). In this poster Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1 Matthews (2001 J Stat Plan Inf 94 43-58)
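
The contrast between the two interpretations can be illustrated with a toy binomial example in Python; the counts and the flat Beta(1, 1) prior are assumptions for illustration, not taken from the poster.

from scipy import stats

x, n, p0 = 18, 25, 0.5                                    # made-up data and null value

# Classical one-sided p-value from an exact binomial test
p_value = stats.binomtest(x, n, p0, alternative="greater").pvalue

# Bayesian posterior under a flat Beta(1, 1) prior: Pr(p > p0 | data)
posterior_prob = stats.beta(1 + x, 1 + n - x).sf(p0)

print(f"p-value: {p_value:.3f}; posterior Pr(p > 0.5): {posterior_prob:.3f}")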


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1 Tony Swain2 Olena Kravchuk1 and Geoffry Fordyce2

1School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia; 2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<5.7%) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms and unequal variances for the repeated measures, with the heifer-by-day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al, Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The vaccine × weaner size × time interaction was only significant in 1992. The vaccine × time interaction was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccinations, but the vaccine effect diminished as heifers aged. The nutrition × weaner size × time interaction was significant in 1990.

Overall the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study with a complicated unbalanced repeated measures design


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1

1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable. Omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the tree, which is then used to predict the other half of the data. By examining the predictive ability of several thousands of trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
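
A rough Python analogue of this workflow (the authors used the R package gbm; the site variables and abundances below are simulated) is:

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n, p = 400, 12
X = rng.normal(size=(n, p))                                   # hypothetical site variables
y = 2 * X[:, 0] + X[:, 1] * X[:, 2] + rng.normal(0, 0.5, n)   # abundance proxy

# Half the data fits the ensemble, the other half checks its predictions
X_fit, X_hold, y_fit, y_hold = train_test_split(X, y, test_size=0.5, random_state=1)
brt = GradientBoostingRegressor(n_estimators=2000, learning_rate=0.01,
                                max_depth=3, subsample=0.5).fit(X_fit, y_fit)

print("holdout R^2:", round(brt.score(X_hold, y_hold), 2))
# Rank variables by their relative influence in the fitted ensemble
print("ranking:", np.argsort(brt.feature_importances_)[::-1][:5])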


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep and in the soil and grass surrounding the faeces. The data come from an 18 month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land, Crop and Food Sciences, University of Queensland; 2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling, in an experiment investigating the digestibility of the sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions changed in an obvious but complex way with changes in the milling energy. The average volumetric diameter alone was not an adequate summary for the distributions. It was thus necessary to construct a tailored algorithm for summarising the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.
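
As a simplified sketch of the log-Normal case (in Python, with simulated diameters rather than the binned volumetric output of the analyser), a three-component log-Normal mixture can be fitted by applying a Gaussian mixture to log particle sizes:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
diam = np.concatenate([rng.lognormal(2.0, 0.3, 4000),
                       rng.lognormal(3.5, 0.4, 3000),
                       rng.lognormal(5.0, 0.5, 3000)])     # hypothetical diameters

gm = GaussianMixture(n_components=3, random_state=0).fit(np.log(diam).reshape(-1, 1))
for w, mu, var in zip(gm.weights_, gm.means_.ravel(), gm.covariances_.ravel()):
    print(f"weight {w:.2f}  median {np.exp(mu):.1f}  sigma(log) {np.sqrt(var):.2f}")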


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1 Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne; 2Remote Sensing Team, CSIRO Sustainable Ecosystems; 3Senior Lecturer, The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
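
The comparison can be sketched in Python as follows, with simulated collinear predictors standing in for the MODIS change metrics and simple cross-validation in place of the 0.632+ bootstrap and GCV used in the paper:

import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n, p = 120, 60
latent = rng.normal(size=(n, 5))
X = latent @ rng.normal(size=(5, p)) + 0.1 * rng.normal(size=(n, p))   # collinear predictors
y = latent[:, 0] - 0.5 * latent[:, 1] + rng.normal(0, 0.5, n)          # mortality proxy

models = {"ridge": RidgeCV(alphas=np.logspace(-3, 3, 25)),
          "lasso": LassoCV(cv=5),
          "pls":   PLSRegression(n_components=3)}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name:5s} cross-validated R^2: {r2:.2f}")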


CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia; 2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens, if not hundreds, of thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and explore the extent to which this might be a problem in applications. In particular, we compare the analysis of log-transformed data to full compositional data analysis.
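
One standard compositional alternative to a naive log transform is the centred log-ratio (clr); a minimal Python sketch on a made-up count table is:

import numpy as np

def clr(counts, pseudo=0.5):
    # Centred log-ratio transform of a count table (rows = samples);
    # a small pseudo-count keeps zeros out of the logarithm
    comp = counts + pseudo
    comp = comp / comp.sum(axis=1, keepdims=True)
    logc = np.log(comp)
    return logc - logc.mean(axis=1, keepdims=True)   # subtract the log geometric mean

rng = np.random.default_rng(6)
counts = rng.poisson(lam=[50, 30, 10, 5, 3, 2], size=(4, 6))   # toy 'omics' counts
print(np.round(clr(counts), 2))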


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual tool for validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative, we have employed a minimum-squares-type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data except for one species out of 83.
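
The two estimators can be contrasted in a short Python sketch; the simulated weights and parameter values below stand in for the trawl data and are arbitrary.

import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(7)
w = rng.gamma(shape=1.3, scale=40.0, size=300)       # hypothetical species weights

# Maximum likelihood fit (location fixed at zero)
a_mle, _, scale_mle = stats.gamma.fit(w, floc=0)

# Least-squares fit on the Probability-Probability plot
w_sorted = np.sort(w)
emp = (np.arange(1, len(w) + 1) - 0.5) / len(w)      # empirical probabilities

def pp_ss(params):
    a, scale = params
    return np.sum((stats.gamma.cdf(w_sorted, a, scale=scale) - emp) ** 2)

a_pp, scale_pp = optimize.minimize(pp_ss, x0=[a_mle, scale_mle],
                                   method="Nelder-Mead").x
print(f"MLE shape {a_mle:.2f}, scale {scale_mle:.1f}; "
      f"P-P fit shape {a_pp:.2f}, scale {scale_pp:.1f}")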


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1

1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Complications of diabetes, such as kidney disease, cause patients considerable pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data on this tendency in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that utilised the files of patients with type 2 diabetes who attended the Isfahan Endocrine and Metabolism Research Center from 1997 to 2007. This information was collected longitudinally. We used linear mixed-effects models to analyse the data. Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information that this study provides can be used to identify groups at high risk of renal dysfunction.

Key words: longitudinal study; mixed-effects models; creatinine; type 2 diabetes


THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS UNDERGOING ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics; 2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Most studies on postoperative complications for isolated CABG surgery from one population may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgery for an Australian population, because there is no model developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation set (60%) and a model validation set (40%). The data in the creation set were used to develop the model, and the validation set was then used to validate it. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC curve and the Hosmer-Lemeshow p-value respectively. Results: Among the 14533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3) years. The two postoperative complications occurred in 3.65% (new renal failure) and 1.38% (stroke) of patients. The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L p < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L p < 0.001). Conclusion: We have identified risk factors for two major postoperative complications of CABG surgery.
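
The creation/validation scheme can be sketched in Python as below; the simulated predictors, effect sizes and event rates are placeholders, not the study data.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(8)
n = 5000
X = np.column_stack([rng.normal(66, 10, n),       # age
                     rng.integers(0, 2, n),       # gender
                     rng.integers(0, 2, n),       # preoperative dialysis
                     rng.normal(90, 25, n)])      # CPB time
lin = -6 + 0.04 * X[:, 0] + 0.8 * X[:, 2] + 0.01 * X[:, 3]
y = rng.binomial(1, 1 / (1 + np.exp(-lin)))       # complication indicator

X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.4, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)
auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
print("validation ROC area:", round(auc, 2))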


SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2

1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics, Informatics and Statistics; 2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis in which physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data. Instead, they are usually point predictions from spatial models fitted to auxiliary data sources. It is not clear what kind of effect the modelled covariates will have on the model, although simple approximations for simple models do give indications. We have performed some simulation studies to investigate the manner of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.


COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno; 2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this article is to identify the dependency structure of gene variants which influence septic states in pediatric patients.

The data set contains data on 580 pediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1, 2, 3]. To identify the role of different combinations of gene variants, and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. In this way it was possible to create a five-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patient group. The typical combinations of gene variants for the healthy group and for the septic patient group were then found. These results correspond well to the results published in [1, 2, 3] for individual genes, and make it possible to recognise the typical combinations of variants of the six genes on which attention should be concentrated.

References: [1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol 33, pp 2158-2164. [2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. In Cesko-Slovenska Pediatrie 59, pp 632-636, 2004. [3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukin-6 gene variants and the risk of sepsis development in children. Human Immunology, ELSEVIER SCIENCE INC, ISSN 0198-8859, 2007, vol 68, pp 756-760.
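
The log-linear modelling step can be sketched in Python via a Poisson GLM on the cell counts; for brevity the sketch uses a made-up three-way table (two gene variants by group) rather than the five-dimensional table analysed in the poster.

import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

cells = pd.DataFrame([(g1, g2, grp) for g1 in (0, 1) for g2 in (0, 1)
                      for grp in ("septic", "healthy")],
                     columns=["TLR299", "IL6_176", "group"])
cells["count"] = np.random.default_rng(9).poisson(60, len(cells))   # made-up counts

full = smf.glm("count ~ C(TLR299) * C(IL6_176) * C(group)", data=cells,
               family=sm.families.Poisson()).fit()
reduced = smf.glm("count ~ (C(TLR299) + C(IL6_176) + C(group)) ** 2", data=cells,
                  family=sm.families.Poisson()).fit()
lr = 2 * (full.llf - reduced.llf)     # likelihood-ratio test for the 3-way term
print("LR statistic for the three-way interaction:", round(lr, 2))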


IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1, PJ Cameron2, PJ Wigley3, S Elliott3, S Madhusudan, JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd; 220 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application with a tractor-mounted boom sprayer of Bacillus thuringiensis Berliner (Bt) was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion of the moths remained was also estimated, viz for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
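
Solving the displayed equation for c is a one-line root-finding exercise; in the Python sketch below the decay parameter b and the proportion p are assumed values, not the estimates from the study.

import numpy as np
from scipy.optimize import brentq

b = 0.02     # per metre, assumed for illustration
p = 0.90     # proportion of moths remaining within distance c

f = lambda c: np.exp(-b * c) * (1 + b * c) - (1 - p)
c_p = brentq(f, 1e-6, 5000)          # root of the dispersal equation
print(f"with these values, 90% of moths remain within about {c_p:.0f} m")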


IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury; 2Imperial College London; 3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study all oxygen saturation measurements across different behavioural states were combined. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M Lee DS Russell G et al (2008 Australian Statistical Conference 2008)
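
The clustering step can be sketched in Python as follows; all oxygen saturation summaries below are simulated, with the grouping of 17 babies into two clusters simply mirroring the description above.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(10)
n_babies, n_segments = 17, 40
cv = np.vstack([rng.normal(loc, 0.01, n_segments)               # per-baby CV values
                for loc in rng.choice([0.03, 0.06], n_babies)])

grid = np.linspace(cv.min(), cv.max(), 50)
ecdfs = np.array([[np.mean(baby <= g) for g in grid] for baby in cv])

# Two groups from hierarchical clustering of distances between the ECDFs
groups = fcluster(linkage(pdist(ecdfs), method="average"), t=2, criterion="maxclust")
print("group sizes:", np.bincount(groups)[1:])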


Index of Presenting Authors

Arnold R 33; Asghari M 26; Baird D 30; Balinas V 46; Barnes M 27; Basford KE 84; Beath K 51; Bland JM 49; Briggs J 76; Burridge C 28; Burzykowski T 73; Butler R (poster) 89; Campain A 56; Chang K 70; Chang Y 85; Chee C 77; Clark ASS 36; Clarke S 74; Clifford D 72; Connolly P (poster) 91; Cui J 55; D'Antuono M 67; Darnell R (1) 35; Darnell R (2) 47; Davy M 40; Day S 43; Ding P 69; Dobbie M 48; Dodds K 88; Fewster R 37; Forrester R 34; Ganesalingam S 79; Ganesh S 78; Gatpatan JMC 48; Graham M 33; Graham P 65; Huang E 87; Hwang J 57; Ihaka R 36; Jones G 45; Kifley A 53; Kipnis V 61; Koolaard J (poster) 92; Kravchuk O (poster 1) 90; Kravchuk O (poster 2) 92; Lazaridis D 93

Le Cao K 81; Littlejohn R 67; Liu I 66; Liu J 82; Lumley T 59; Marschner I 52; Matthews D 58; McLachlan A 44; Meyer D 68; Meyer R 75; Mohebbi M 38; Mueller S 47; Muller W (poster) 94; Naka M (poster) 95; Neeman T 82; Neuhäuser M 86; Orellana L 54; Park E 64; Park Z 42; Pirie M 32; Poppe K 71; Rousta S (poster) 96; Ruggiero K 71; Ryan L 25; Sanagou M (poster) 97; Saville D 83; Scott A 60; Shimadzu H 62; Shimadzu H (poster) 98; Sibanda N 76; Smerek M (poster) 99; Smith AB 69; Stewart M 77; Stojanovski E 31; Taylor J 41; Thijs H 50; Triggs CM 80; Wallace AR (poster) 100; Wang Y 29; Welsh A 51; Williams E 70; Yee T 62; Yelland L 63; Yoon H 39; Zahari M (poster) 101


DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

D'Antuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau


Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde


Niven Brian bnivenmathsotagoacnz

O'Brien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

O'Sullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz


A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre Drinks (beer house wine soft drinks and orange juice) and a selection of hot and cold hors drsquooeuvres will be on offer

Keynote addresses

Louise Ryan (Monday 30 Nov) Martin Bland (Tuesday 1 Dec) Thomas Lumley (Wednesday 2 Dec) Chris Triggs (Thursday 3 Dec)

All keynote addresses begin at 9am and will be held in Swifts (see map of venue on page 16)

Invited Speakers

Ross Ihaka (1330 Monday 30 Nov) Alison Smith (1330 Wednesday 2 Dec)

Kaye Basford (1100 Thursday 3 Dec)

All invited speaker talks will be held in Swifts (see map of venue on page 16)

Organised Social Activities - Tuesday 1 Dec

This is a long-standing part of the conference program so keeping with tradition we have arranged four options for the afternoon of Tuesday 1 December after lunch We hope that you will find at least one of these activities attractive because we want you to relax get a breath of fresh air (or sulphur fumes at Orakei Korako) have fun and see some of what this part of New Zealand has to offer especially for first-time visitors These activities are optional and tickets need to be purchased for them through the conference organisers Preferences will be considered on a first come first served basis If you have queries about any of the social activities please contact Hans Hockey in the first instance An afternoon snack will be provided on the cruise and kayaking while the bus tour visits a cafe and the jet boating trip is too action-packed for eating

If you have not already registered for one of these activities please talk to someone at the registration desk to make arrangements before Tuesday morning Please note that all costs for the activities are in New Zealand Dollars

3The International Biometric Society Australasian Region Conference

Conference At A Glance

Conference dinner - Wednesday 2 Dec

To add some novelty to the conference program the conference dinner will be held at the Prawn Park restaurant home of Shawn the Prawn At the Prawn Park just 10 minutes drive north of Taupo on the Waikato River (see map on page 8) you will be able to compete at prawn fishing or the Killer Prawn Hole-in-One Golf Challenge take a guided tour of the nursery and hatchery enjoy fun and interactive water features and a glass of bubbly in the geothermal footbath as well as a sumptuous meal with breathtaking views at the riverside restaurant (on the patio weather permitting) Drinks (wine and non-alcoholic) will be provided and all dietary requirements can be catered for

Coaches have been arranged to transfer delegates to Huka Prawn Farm from the Suncourt Hotel leaving 6 pm with return trips at the conclusion of the event

Conference Prizes - Donated by CSIRO Mathematical and

Information Sciences

Prizes will be awarded for the best oral presentation and the best poster presentation by a young statistician as judged by a panel To be eligible for these awards the presenter must be a member of the IBS Australasian Region and be either a student (full-time or part-time) or a person who has graduated with a Bachelorrsquos Degree (in a biometrical-related field) within the last five years or a person awarded a Postgraduate Degree within the past year

4 The International Biometric Society Australasian Region Conference

KEYNOTE SPEAKERSMartin Bland University of York

Martin Bland joined the University of York as Professor of Health Statistics in 2003 Before this he spent 27 years at St Georgersquos Hospital Medical School University of London following posts at St Thomasrsquos Hospital Medical School and in industry with ICI He has a BSc in mathematics an MSc in Statistics and a PhD in epidemiology He is the author or co-author of An Introduction to Medical Statistics now in its third edition and Statistical Questions in Evidence-based Medicine both Oxford University Press 190+ refereed journal articles reporting public health and clinical research and on research methods and with Prof Doug Altman the Statistics Notes series in the British Medical Journal He is currently working on clinical trials in wound care hazardous alcohol use depression irritable bowel syndrome and stroke prevention His personal research interests are in the design and analysis of studies of clinical measurement and of cluster randomised clinical trials His 1986 Lancet paper with Doug Altman on statistical methods for assessing agreement between two methods of clinical measurement has now been cited more than 13000 times and is the most cited paper ever to appear in the Lancet and has been reported to be the sixth most highly cited statistical paper ever

Martin presented a two-day satellite course in Auckland on 25-26 November on Cluster Randomised Trials

Thomas Lumley University of Washington

Thomas Lumley is an Associate Professor in the Biostatistics Department at the University of Washington in Seattle Thomas has accrued an impressive body of work and awards in a comparatively short amount of time Since completing his PhD in 1998 Thomas has published well over 100 peer reviewed articles in the leading journals of statistics biostatistics and the health sciences on theory methodology and application In addition he has given a substantial number of talks and workshops around the world In 2008 Thomas was awarded the Gertrude Cox Award for contributions to Statistical Practice Thomas is also a member of the R Core development team and his expertise in the field of statistical computing is recognised worldwide

5The International Biometric Society Australasian Region Conference

Keynote Speakers

Louise Ryan CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics Informatics and Statistics (CMIS) Dr Ryan has a distinguished career in biostatistics having authored or co-authored over 200 papers in peer-reviewed journals Louise is a fellow of the American Statistical Association and the International Statistics Institute and is an elected member of the Institute of Medicine She has served in a variety of professional capacities including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society She has served on advisory boards for several government agencies in the USA including the National Toxicology Program and the Environmental Protection Agency as well as several committees for the National Academy of Science She retains an adjunct professorship at Harvard

Chris Triggs University of Auckland

Chris Triggs is a Professor as well as being the current department head of Statistics at the University of Auckland New Zealand He has been a respected statistician for 30 years specializing in fields as diverse as experimental design and forensic science Professor Triggs has published more than 90 papers in a wide variety of statistical fields His research interests include experimental design population genetics and the application of statistical methods in many fields of science including forensic science and nutrigenomics He has lectured extensively in many of these subjects in Australasia Professor Triggs is an Associate Editor for Biometrics and is often called upon as referee for many other journals

6 The International Biometric Society Australasian Region Conference

INVITED SPEAKERSRoss Ihaka University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland He is recognized as one of the originators of the R programming language In 2008 he received the Royal Society of New Zealandrsquos Pickering Medal for his work on R

Kaye Basford University of Queensland

Kaye Basford is Head of the School of Land Crop and Food Sciences at the University of Queensland Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments in particular using a pattern analysis approach Kaye is currently IBS Vice-President in advance of her Presidential term 2010-11

Alison Smith NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit where she works on and researches methodology for plant breeding multi-environment variety trials plant quality trait experiments micro-array data and outlier detection in linear mixed models

7The International Biometric Society Australasian Region Conference

GENERAL INFORMATIONName TagsPlease wear your name badge at all times during the conference and at social events

Mobile PhonesAs a courtesy to presenters and colleagues please ensure that your mobile phone is switched off during the conference sessions

Conference CateringLunches Morning and Afternoon Teas will be served at the lsquoChill on Northcroftrsquo Restaurant (see venue floor plan on page 16)

Conference DinnerTickets are required for the Conference Dinner If you have misplaced or did not receive tickets at registration or wish to purchase additional tickets please see one of the conference organisers at the registration desk

Transport has been arranged in coaches to transfer delegates to dinner from the Suncourt Hotel amp Conference Centre leaving 6 pm with return trips at the conclusion of the event

Welcome reception (Sunday 29 November)A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre Drinks (beer house wine soft drinks and orange juice) and a selection of hot and cold hors drsquooeuvres will be on offer

8 The International Biometric Society Australasian Region Conference

VENUE INFORMATION amp MAPVenueThe Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo New Zealand It has uninterrupted views of the stunning great Lake Taupo with a backdrop of the majestic volcanoes Mt Ruapehu Mt Tongariro and Mt Ngauruhoe

Suncourt Hotel Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo The lake town centre boat harbour cafeacutes and night-life are all only a stroll away

Driving directions to Huka Prawn FarmHead west on Northcroft Street toward Titiraupenga Street (02km)Turn left at Titiraupenga Street (31m)Turn right at Lake Tce (05km)(or alternatively go up to Heuheu Street then onto Tongariro Street)Continue onto Tongariro Street (11km - go through one roundabout)Continue onto SH 1 SH5 (10km)Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (Note that Huka Falls Road becomes Karetoto Road)Take the sign-posted right just past Helistar and continue straight past Honey Hive to end of Karetoto Road

A Suncourt Hotel amp Conference Centre14 Northcroft Street Taupo(07) 378 8265wwwsuncourtconz

B Huka Prawn FarmHuka Falls RoadWairakei Park Taupo

9The International Biometric Society Australasian Region Conference

ORGANISED SOCIAL ACTIVITIES Conferences can be intense and lead to ldquobrain strainrdquo for some so relief from the scientific program is often welcome and necessary for recharging ones batteries With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member non-member or student) attending the whole week Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events

Young Statisticiansrsquo Night - Monday 30 NovThis social event is for young statisticians to get together in an informalrelaxing atmosphere so you can share your research and meet your possible future colleagues As long as you consider yourself as a ldquoyoung statisticianbiometricianrdquo you are welcome to attend this event We will meet at 6pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim then meal at the Terraces Hotel (80-100 Napier Taupo Highway Taupo Tel (07) 378-7080)

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities- Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors smell the coffee brewing as you board the Waikare II take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park The sights are amazing all year round Afternoon tea is included as part of your charter and tea or coffee are complimentary throughout the cruise There are also full bar facilities

Fishing for and hopefully eating rainbow or brown trout is included in the charter although to meet licence requirements only four clients can be nominated to actually land the catch Only 4 lines can be put out at a time on downriggers If successful any catch can be barbequed or sashimied and served and shared onboard - there

10 The International Biometric Society Australasian Region Conference

Organised Social Activities

is nothing like freshly caught trout There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this The trout could also be taken back to your accommodation where the chef could perhaps prepare them as part of one of your dinners (In New Zealand trout as a game fish cannot be sold so you wonrsquot see it on restaurant menus) Or you may wish to cook it yourself in the Suncourt Hotelrsquos barbeque area or elsewhere You may also wish to do clay bird shooting The cost is $180 per shot

Time 230 pm charter departure allowing time to walk to the Boat Harbour after lunch returning about 530 pm to berthWhere Boat harbourmarina at mouth of the Waikato River at north end of lake frontTake Swimwear including towel if you want an invigorating deep water swim off the launch Donrsquot forget to take your camera as some of the scenery can only be seen from on the waterCost $70 per person based on a three hour scenic charter including fishing with clay bird shooting extra at $180 per shotNotes For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River and the kayaking finishes with a small rapid at Reidrsquos Farm Kayak Course Then a short shuttle ride takes you to the renowned Huka Falls from where you are able to walk back up river to Spa Park

It is a gentle paddle down through a scenic waterway with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route It is a great trip for the ldquofirst timerdquo kayaker of any age or for those wanting the most peaceful scenic cruise available in Taupo Exiting the river there is time for a snack on the riverbank before you are transported to the Huka Falls where you may wish to find the track accessing their base You will have plenty of time for sightseeing and the walk back to back to Spa Park

11The International Biometric Society Australasian Region Conference

Organised Social Activities

Leaving the gushing sounds of the mesmerizing Falls you cut though a leafy regrowth to on a beautiful scenic river track The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs mid-channel islands and the Huka Falls Lodge on the far bank There are some scenic lookouts along the way where you can take in the full glory of the Majestic Waikato river As you near Spa Park the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet The track is graded as an easy walk and should take about one hour but you have longer before pick up at a pre-arranged time to return to your residence

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time Pickup from Suncourt Hotel at 130 pm return around 600 pmTake Swimwear towel outdoors shoes towel sunscreen hat and camera (waterproof case may be handy)Cost $50 per personNotes This activity requires a minimum of 4 people to proceed Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating geothermal and nature - Orakei Karako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen Tour through great pine forests and open farmlands that line the river bank along with some of New Zealandrsquos most beautiful unspoilt native bush The sleek black boat takes you through the magnificent Tutukau Gorge where canyon walls rise 50 dramatic metres above the magnificent Waikato River on the way to the hidden valley of Orakei Karako possibly the best thermal area in New Zealand


In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's Cave and some of the largest silica terraces in the world.

While the park visitors are on land, option 2 ($140) whisks swimmers away to the Squeeze. You will disembark the boat in knee-deep warm water. After manoeuvring your way through narrow crevasses, climbing boulders and wading through waist-deep warm water, you emerge in stunning native New Zealand bush. Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool.

Then the groups rejoin for the thrilling return trip, giving a total trip time of about three hours. This is the only jet boat ride in New Zealand that incorporates a geothermal wonderland experience.

Time: Transport departs Suncourt Hotel at 1.30 pm; returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera, as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1, including park admission; $140 pp for option 2; both options include transport
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used


4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; and a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop-off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly; terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: Cafe snack is not included, but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22


SPONSORS
The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland


AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax


VENUE FLOOR PLAN

1. Boardroom: For all Boardroom session presentations
2. Swifts: For keynote addresses, invited speaker talks and all Swifts sessions
3. Bathrooms/Toilets
4. 'Chill on Northcroft' Restaurant: All morning/afternoon teas and lunches will be provided here
5. Gullivers: Computer room with two internet access desktops
6. Lems: Registration desk location, and further desk space and power points for wireless internet access

[Floor plan diagram: locations 1-6 marked as listed above]


CONFERENCE TIMETABLE

SUNDAY 29TH NOV
1600 Conference Registration opens
1800 Welcome Reception
     Dinner (own arrangement)

MONDAY 30TH NOV
850  Presidential Opening (Swifts): Graham Hepworth, University of Melbourne
900  Keynote Address (Swifts): Louise Ryan, CSIRO Mathematics, Informatics and Statistics. Quantifying uncertainty in risk assessment. Chair: Graham Hepworth
950-1030 Session 1. Swifts: Medical (Chair: John Field). Boardroom: Ecological Modelling (Chair: Teresa Neeman)
  950  Swifts: Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approach (Mohamad Asghari, Tarbiat Modares University)
       Boardroom: Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splines (Charis Burridge, CSIRO Mathematics, Informatics and Statistics)
  1010 Swifts: Personalised medicine: endovascular aneurysm repair risk assessment model using preoperative variables (Mary Barnes, CSIRO Mathematics, Informatics and Statistics)
       Boardroom: Rank regression for analyzing environmental data (You-Gan Wang, CSIRO Mathematics, Informatics and Statistics)
1030 Morning Tea (30 minutes)
1100-1220 Session 2. Swifts: Modelling (Chair: Andrew McLachlan). Boardroom: Environmental & Methods (Chair: Zaneta Park)
  1100 Swifts: Introduction to quantile regression (David Baird, VSN NZ Ltd)
       Boardroom: Capture recapture estimation using finite mixtures of arbitrary dimension (Richard Arnold, Victoria University)


  1120 Swifts: Incorporating study characteristics in the modelling of associations across studies (Elizabeth Stojanovski, University of Newcastle)
       Boardroom: The effect of a GnRH vaccine, GonaCon, on the growth of juvenile tammar wallabies (Robert Forrester, ANU)
  1140 Swifts: A comparison of matrices of time series with application in dendroclimatology (Maryanne Pirie, University of Auckland)
       Boardroom: Model based grouping of species across environmental gradients (Ross Darnell, CSIRO Mathematics, Informatics and Statistics)
  1200 Swifts: How SAS and R integrate (Michael Graham, SAS Auckland)
       Boardroom: The use of the chi-square test when observations are dependent (Austina Clark, University of Otago)
1220 Lunch (1 hour 10 minutes)
1330 Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Writing Efficient Programs in R and Beyond. Chair: Renate Meyer
1410-1510 Session 3. Swifts: Variance (Chair: Geoff Jones). Boardroom: Genetics (Chair: John Koolaard)
  1410 Swifts: Variance estimation for systematic designs in spatial surveys (Rachel Fewster, University of Auckland)
       Boardroom: Developing modules in GenePattern for gene expression analysis (Marcus Davy, Plant and Food Research)
  1430 Swifts: Variance components analysis for balanced and unbalanced data in reliability of gait measurement (Mohammadreza Mohebbi, Monash University)
       Boardroom: High dimensional QTL analysis within complex linear mixed models (Julian Taylor, CSIRO Mathematics, Informatics and Statistics)
  1450 Swifts: Modernizing AMOVA using ANOVA (Hwan-Jin Yoon, ANU)
       Boardroom: Correlation of transcriptomic and phenotypic data in dairy cows (Zaneta Park, AgResearch)
1510 Afternoon Tea (30 minutes)


1540-1700 Session 4. Swifts: Modelling (Chair: Mario D'Antuono). Boardroom: Ecology (Chair: Rachel Fewster)
  1540 Swifts: Non-inferiority margins in clinical trials (Simon Day, Roche Products Ltd)
       Boardroom: Visualising model selection criteria for presence and absence data in ecology (Samuel Mueller, University of Sydney)
  1600 Swifts: Data processing using Excel with R (Andrew McLachlan, Plant and Food Research, Lincoln)
       Boardroom: Estimating weights for constructing composite environmental indices (Ross Darnell, CSIRO Mathematics, Informatics and Statistics)
  1620 Swifts: Investigating covariate effects on BDD infection with longitudinal data (Geoffrey Jones, Massey University)
       Boardroom: A spatial design for monitoring the health of a large-scale freshwater river system (Melissa Dobbie, CSIRO Mathematics, Informatics and Statistics)
  1640 Swifts: Statistical modelling of intrauterine growth for Filipinos (Vincente Balinas, University of the Philippines Visayas)
       Boardroom: Backfitting estimation of a response surface model (Jhoanne Marsh C Gatpatan, University of the Philippines Visayas)
1700 Poster Session. Chair: Melissa Dobbie
1800 Dinner (own arrangement)


TUESDAY 1ST DEC
900 Keynote Address (Swifts): Martin Bland, University of York. Clustering by treatment provider in randomised trials. Chair: Simon Day
950-1030 Session 1. Swifts: Missing Data (Chair: Vanessa Cave). Boardroom: Count Data (Chair: Hwan-Jin Yoon)
  950  Swifts: The future of missing data (Herbert Thijs, Hasselt University)
       Boardroom: A strategy for modelling count data which may have extra zeros (Alan Welsh, ANU)
  1010 Swifts: Application of latent class with random effects models to longitudinal data (Ken Beath, Macquarie University)
       Boardroom: A reliable constrained method for identity link Poisson regression (Ian Marschner, Macquarie University)
1030 Morning Tea / IBS Biennial General Meeting (60 minutes)
1130-1230 Session 2. Swifts: Medical (Chair: Hans Hockey). Boardroom: Modelling (Chair: Olena Kravchuk)
  1130 Swifts: Multivariate response models for global health-related quality of life (Annette Kifley, Macquarie University)
       Boardroom: Building a more stable predictive logistic regression model (Anna Campain, University of Sydney)
  1150 Swifts: Estimation of optimal dynamic treatment regimes from longitudinal observational data (Liliana Orellana, Universidad de Buenos Aires)
       Boardroom: Stepwise paring down variation for identifying influential multifactor interactions (Jing-Shiang Hwang, Academia Sinica)
  1210 Swifts: Parametric conditional frailty models for recurrent cardiovascular events in the LIPID study (Jisheng Cui, Deakin University)
       Boardroom: Empirical likelihood estimation of a diagnostic test likelihood ratio (David Matthews, University of Waterloo)
1230 Lunch (1 hour)
1330 Organised Social Activities
1800 Dinner (own arrangement)


WEDNESDAY 2ND DEC
900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Using the whole cohort in analysis of subsampled data. Chair: Alan Welsh
950-1030 Session 1. Swifts: Clinical Trials (Chair: Ian Marschner). Boardroom: Fisheries (Chair: Charis Burridge)
  950  Swifts: Adjusting for nonresponse in case-control studies (Alastair Scott, University of Auckland)
       Boardroom: An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimation (Hideyasu Shimadzu, Geoscience Australia)
  1010 Swifts: Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associations (Victor Kipnis, USA National Cancer Institute)
       Boardroom: On the 2008 World Fly Fishing Championships (Thomas Yee, University of Auckland)
1030 Morning Tea (30 minutes)
1100-1220 Session 2. Swifts: Medical Models (Chair: Katrina Poppe). Boardroom: Agriculture/Horticulture (Chair: Emlyn Williams)
  1100 Swifts: Relative risk estimation in randomised controlled trials: a comparison of methods for independent observations (Lisa Yelland, University of Adelaide)
       Boardroom: Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactation (Roger Littlejohn, AgResearch)
  1120 Swifts: Multiple stage procedures in covariate-adjusted response-adaptive designs (Eunsik Park, Chonnam National University)
       Boardroom: Some statistical approaches in estimating lambing rates (Mario D'Antuono, Dept of Agriculture WA)


  1140 Swifts: Potential outcomes and propensity score methods for hospital performance comparisons (Patrick Graham, University of Otago)
       Boardroom: FTIR analysis associations with induction and release of kiwifruit buds from dormancy (Denny Meyer, Swinburne University of Technology)
  1200 Swifts: Local odds ratio estimation for multiple response contingency tables (Ivy Liu, Victoria University)
       Boardroom: Non-linear mixed-effects modelling for a soil temperature study (Pauline Ding, ANU)
1220 Lunch (1 hour 10 minutes)
1330 Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Embedded partially replicated designs for grain quality testing. Chair: David Baird
1410-1510 Session 3. Swifts: Design (Chair: Ross Darnell). Boardroom: Functional Analysis (Chair: Marcus Davy)
  1410 Swifts: Spatial models for plant breeding trials (Emlyn Williams, ANU)
       Boardroom: Can functional data analysis be used to develop a new measure of global cardiac function? (Katrina Poppe, University of Auckland)
  1430 Swifts: A two-phase design for a high-throughput proteomics experiment (Kevin Chang, University of Auckland)
       Boardroom: Variable penalty dynamic warping for aligning GC-MS data (David Clifford, CSIRO)
  1450 Swifts: Shrinking sea-urchins in a high CO2 world: a two-phase experimental design (Kathy Ruggiero, University of Auckland)
       Boardroom: A model for the enzymatically 18O-labeled MALDI-TOF mass spectra (Tomasz Burzykowski, Hasselt University)
1510 Afternoon Tea (30 minutes)


1540-1700 Session 4. Swifts: Methods (Chair: David Clifford). Boardroom: Mixtures & Classification (Chair: Thomas Yee)
  1540 Swifts: High-dimensional multiple hypothesis testing with dependence (Sandy Clarke, University of Melbourne)
       Boardroom: On estimation of nonsingular normal mixture densities (Michael Stewart, University of Sydney)
  1600 Swifts: Metropolis-Hastings algorithms with adaptive proposals (Renate Meyer, University of Auckland)
       Boardroom: Estimation of finite mixtures with nonparametric components (Chew-Seng Chee, University of Auckland)
  1620 Swifts: Bayesian inference for multinomial probabilities with non-unique cell classification and sparse data (Nokuthaba Sibanda, Victoria University)
       Boardroom: Classification techniques for class imbalance data (Siva Ganesh, Massey University)
  1640 Swifts: Filtering in high dimension dynamic systems using copulas (Jonathon Briggs, University of Auckland)
       Boardroom: Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the mean (Selvanayagam Ganesalingam, Massey University)
1800 Conference Dinner


THURSDAY 3RD DEC
900 Keynote Address (Swifts): Chris Triggs, University of Auckland. Nutrigenomics - a source of new statistical challenges. Chair: Ruth Butler
950-1030 Session 1. Swifts: Genetics (Chair: Ken Dodds). Boardroom: Ecology (Chair: Duncan Hedderley)
  950  Swifts: Combination of clinical and genetic markers to improve cancer prognosis (Kim-Anh Le Cao, University of Queensland)
       Boardroom: A multivariate feast among bandicoots at Heirisson Prong (Teresa Neeman, ANU)
  1010 Swifts: Effective population size estimation using linkage disequilibrium and diffusion approximation (Jing Liu, University of Auckland)
       Boardroom: Environmental impact assessments: a statistical encounter (Dave Saville, Saville Statistical Consulting Ltd)
1030 Morning Tea (30 minutes)
1100 Invited Speaker (Swifts): Kaye Basford, University of Queensland. Ordination of marker-trait association profiles from long-term international wheat trials. Chair: Lyn Hunt
1140-1220 Session 2. Swifts: Medical (Chair: Ken Beath). Boardroom: Genetics (Chair: Julian Taylor)
  1140 Swifts: Finding best linear combination of markers for a medical diagnostic with restricted false positive rate (Yuan-chin Chang, Academia Sinica)
       Boardroom: Believing in magic: validation of a novel experimental breeding design (Emma Huang, CSIRO Mathematics, Informatics and Statistics)
  1200 Swifts: A modified combination test for the analysis of clinical trials (Markus Neuhäuser, Rhein Ahr Campus)
       Boardroom: Phenotypes for training and validation of whole genome selection methods (Ken Dodds, AgResearch)
1220 Closing Remarks
1230 Lunch
1300 Conference Concludes


ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts): Louise Ryan, CSIRO Mathematics, Informatics and Statistics. Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise Ryan
CSIRO Mathematics, Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.


950-1030 MONDAY 30TH NOV
Session 1, Swifts: Medical. Chair: John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal cancer (CRC) is one of the most malignant cancers in the world, and its incidence varies because of the differing effects of risk factors in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment application. However, evaluation of the risk factors of the cancer as a whole would not provide a thorough understanding of the cancer. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis, according to the pathology report of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007, were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis utilizing Stata statistical software. The results confirm gender, alcohol history, IBD and tumor grade as specific risk factors of colon cancer, and hypertension, opium and personal history as specific risk factors of rectum cancer. Also, BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity, and colon and rectum cancers should be evaluated specifically to reveal hidden associations which may not be revealed under general modeling. These findings could provide more information for prognosis and treatment therapy, and possible application of screening programs specifically for colon and rectum carcinomas.


PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia, Mathematics, Informatics and Statistics, Glen Osmond, South Australia
2Department of Surgery, University of Adelaide, the Queen Elizabeth Hospital, Adelaide, South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001, and whose outcomes were followed for more than five years.

The ERA model is available at the following website (www.health.adelaide.edu.au/surgery/evar). The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using specialist UK vascular institute data. Despite UK patients being sicker (p<0.001), having larger aneurysms (p<0.001) and being more likely to die (p<0.05) than the Australian patients, the ERA model fitted UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher area under ROC curves and/or higher R2.

The ERA model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008). Eur J Vasc Endovasc Surg 35:571-579.


950-1030 MONDAY 30TH NOV
Session 1, Boardroom: Ecological Modelling. Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that an annual multi-species fishery-independent survey be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region as well as the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
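
A minimal sketch of the general technique (and only that): the R code below fits a penalised regression spline surface to simulated catch counts using the mgcv package. It is a frequentist stand-in for the BayesX MCMC fit described above; the coordinates, counts and knot basis size are assumptions for illustration.

# Illustrative sketch only: 2-d penalised regression spline for simulated catch counts.
library(mgcv)
set.seed(4)
dat <- data.frame(lon = runif(300), lat = runif(300))                 # hypothetical trawl locations
dat$catch <- rpois(300, exp(1 + sin(2 * pi * dat$lon) + dat$lat))     # hypothetical counts
fit <- gam(catch ~ s(lon, lat, k = 30), family = poisson, data = dat) # penalised spline surface
summary(fit)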


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics, Informatics and Statistics, Australia
2School of Mathematics and Statistics, Northeast Normal University, China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust, and has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions. To alleviate the computational burden, we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.


1100-1220 MONDAY 30TH NOV
Session 2, Swifts: Modelling. Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird
VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize the sum over observations of e(Q - I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
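
A minimal sketch of the idea in R, using the quantreg package (one of the implementations noted above); the simulated data frame and the chosen quantiles are illustrative assumptions, not part of the talk.

# Minimal quantile regression sketch with the quantreg package (illustrative data).
library(quantreg)
set.seed(1)
dat <- data.frame(x = runif(100, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rt(100, df = 3)          # heavy-tailed errors
# tau = 0.5 minimises the sum of absolute residuals (the median fit);
# other tau values use the asymmetric loss e(Q - I(e < 0)) described above.
fits <- rq(y ~ x, tau = c(0.1, 0.5, 0.9), data = dat)
summary(fits)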


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS, University of Newcastle
2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study level characteristics, and between and within study variance. Initially the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed to be distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie
Department of Statistics and School of Geography, Geology and Environment, University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are the result of a possible failure of the uniformitarianism principle. This is because the response of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and hence this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring width indices for each core was divided into two subsets:

1. The portion of the series produced when the trees were small, and

2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham
Analytics, SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers a way to experiment with new cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together, and plans for future integration.

1100-1220 MONDAY 30TH NOV
Session 2, Boardroom: Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ
2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use reversible jump MCMC (RJMCMC) to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit, ANU
2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals, but as yet has not been tested in marsupials. Thirty five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaConTM (Vac1), or a single vaccination of GonaConTM followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions, at irregular intervals, over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or the boosted vaccination.

The data are analysed using repeated measures methods to assess the long term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Wealth from Ocean Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark
University of Otago

E-mail aclarkmathsotagoacnz

When the chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S. W. (1958, JEBS, 69-82) and Huynh, H. & Feldt, L. S. (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.

1330 MONDAY 30TH NOV
Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ
2University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R, and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.
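
The example below is not taken from the talk; it is a hedged illustration of one familiar efficiency problem of the kind the abstract refers to: growing an object inside a loop versus using a vectorised expression.

# Illustrative only: a common R efficiency pitfall and its vectorised fix.
n <- 1e4
x <- runif(n)
slow <- c()
for (i in seq_len(n)) slow <- c(slow, x[i]^2)   # repeated copying as the vector grows
fast <- x^2                                     # single vectorised operation
all.equal(slow, fast)                           # same result, very different cost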


1410-1510 MONDAY 30TH NOV
Session 3, Swifts: Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster
Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, to approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modelling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested, and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within subject, within assessor and within session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods is illustrated with examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between and within region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlEx.

Using fungus microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages, in which ANOVA and REML are standard methods, may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
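
A hedged sketch of the point being made: the hierarchical variance partition can be obtained by REML in a general-purpose package. The lme4 call below uses invented data (regions, populations within regions, individuals); it is not the authors' analysis of the fungus microsatellite data.

# Hedged sketch: hierarchical variance components by REML with lme4 (invented data).
library(lme4)
set.seed(1)
gen <- expand.grid(region = factor(1:3), pop = factor(1:4), ind = factor(1:10))
re_region <- rnorm(3, sd = 1.0)                       # region effects
re_pop    <- rnorm(12, sd = 0.5)                      # population-within-region effects
gen$y <- re_region[gen$region] +
         re_pop[interaction(gen$region, gen$pop)] + rnorm(nrow(gen))
fit <- lmer(y ~ 1 + (1 | region) + (1 | region:pop), data = gen, REML = TRUE)
VarCorr(fit)   # variance among regions, among populations within regions, and residual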


1410-1510 MONDAY 30TH NOV
Session 3, Boardroom: Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules, using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla12

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of quantitative trait loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation such as spatial trends and extraneous environmental variation need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high dimensional genetic component becomes problematic. This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way, even when the number of genetic variables exceeds the number of observations. The method is then applied to wheat quality traits and a well established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2LiveStock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes, for both liver and fat samples, in >250 dairy cows, and associated phenotypic data (milk yield, protein, casein and total solids percentage and yield, and growth hormone, IGF and insulin levels). This data is highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data were analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data were used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.
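
A hedged sketch of the first-pass correlation screen described above, written for toy-sized matrices; the matrix dimensions, variable names and significance calculation are assumptions for illustration, not the authors' code.

# Hedged sketch: screening all gene-expression x phenotype correlations (toy data).
set.seed(2)
expr  <- matrix(rnorm(250 * 100), nrow = 250)   # cows x genes (toy dimensions)
pheno <- matrix(rnorm(250 * 5),   nrow = 250)   # cows x phenotypes
cors  <- cor(expr, pheno, use = "pairwise.complete.obs")   # genes x phenotypes
tstat <- cors * sqrt((250 - 2) / (1 - cors^2))              # t statistic for each correlation
pvals <- 2 * pt(-abs(tstat), df = 250 - 2)
head(order(pvals), 10)   # positions of the strongest gene-phenotype associations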


1540-1700 MONDAY 30TH NOV
Session 4, Swifts: Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined. Some are highly statistical, some are based much more on clinical judgement; some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum, we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment, nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin, to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication, and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan
Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample these texture analysis and rheological methods generated many data points, which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively, by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly.
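
A hedged sketch of the kind of summary such a system automates: locating a peak and the region of steepest slope on a sampled curve. The readings below are invented, and the code stands in for, rather than reproduces, the Excel/RExcel implementation.

# Hedged sketch: peak and maximum-slope summaries for one sampled curve (invented data).
force <- c(0.1, 0.4, 1.2, 2.8, 3.9, 4.1, 3.2, 2.0, 1.1, 0.6)   # hypothetical readings
time  <- seq_along(force)
peak_index <- which.max(force)                 # location of the highest point
slope      <- diff(force) / diff(time)         # first differences approximate the slope
steepest   <- which.max(abs(slope))            # interval with the greatest slope
c(peak_time = time[peak_index], steepest_interval = steepest)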


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful, hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern, as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurements of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth of different populations differs. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length


1540-1700 MONDAY 30TH NOV
Session 4, Boardroom: Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrumbungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled in dependence on a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains, such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Australia, Mathematics, Informatics and Statistics
2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, how best to handle the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second order model, central composite design
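
A hedged sketch of the backfitting iteration itself, for a simple additive model with two smooth components; it is generic, uses invented data, and does not reproduce the central composite design setting of the abstract.

# Hedged sketch: generic backfitting for y = f1(x1) + f2(x2) + error (invented data).
set.seed(3)
n  <- 200
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)
alpha <- mean(y)
f1 <- rep(0, n); f2 <- rep(0, n)
for (iter in 1:20) {
  f1 <- predict(smooth.spline(x1, y - alpha - f2), x1)$y   # update f1 holding f2 fixed
  f1 <- f1 - mean(f1)                                      # centre for identifiability
  f2 <- predict(smooth.spline(x2, y - alpha - f1), x2)$y   # update f2 holding f1 fixed
  f2 <- f2 - mean(f2)
}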


TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland
Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.

950 - 1030

TUESDAY 1ST DEC, Session 1 (Swifts): Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-BioStat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were developed more recently in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defense of the above mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as the other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.

APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

950 - 1030

TUESDAY 1ST DEC, Session 1 (Boardroom): Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

We will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
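
A hedged sketch of this kind of model progression, using Python/statsmodels on simulated counts (the data, variable names and the particular zero-inflated and negative binomial fits are our illustrative choices, not the author's example):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

# Simulated counts with excess zeros (a stand-in for real data)
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(size=n)
X = sm.add_constant(x)
mu = np.exp(0.5 + 1.0 * x)
y = np.where(rng.uniform(size=n) < 0.3, 0, rng.poisson(mu))  # ~30% structural zeros

# Step 1: simple Poisson regression
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

# Step 2: accommodate extra zeros with a zero-inflated Poisson model
zip_fit = ZeroInflatedPoisson(y, X, exog_infl=np.ones((n, 1))).fit(disp=0)

# Step 3: accommodate overdispersion with a negative binomial model
nb_fit = sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit()

# Compare the fits (e.g. by AIC) and follow up with residual diagnostics
print(poisson_fit.aic, zip_fit.aic, nb_fit.aic)
```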

A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable, due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients, rather than the fitted means, to be non-negative. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.
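
For orientation, the following is a sketch of the classical non-negative EM update for identity-link Poisson regression on which such deconvolution-style approaches build (our notation and code; the method described above additionally removes the non-negativity restriction on the coefficients via subset maximisations):

```python
import numpy as np

def em_identity_poisson(X, y, n_iter=1000):
    """EM-type multiplicative updates for Poisson regression with an identity
    link, mu = X @ beta, assuming a non-negative design matrix X and
    non-negative coefficients beta (the classical Poisson deconvolution update)."""
    beta = np.full(X.shape[1], y.mean() / max(X.mean(), 1e-12) / X.shape[1])
    colsum = X.sum(axis=0)
    for _ in range(n_iter):
        mu = X @ beta
        beta = beta * (X.T @ (y / np.clip(mu, 1e-12, None))) / colsum
    return beta

# Hypothetical usage on simulated data with non-negative covariates
rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(100, 3)))
y = rng.poisson(X @ np.array([1.0, 2.0, 0.5]))
print(em_identity_poisson(X, y))
```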

1130 - 1230

TUESDAY 1ST DEC, Session 2 (Swifts): Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.

ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1 Andrea Rotnitzky23 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models, and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV-positive patients to illustrate estimation of the optimal CD4 count level at which to start HAART.
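
In symbols (our notation, reconstructing the quantities named above rather than quoting the paper): writing Y(g_x) for the counterfactual utility under regime g_x and V for the baseline covariates used to borrow information, the working model and target are roughly

$$
E\!\left[ Y(g_x) \mid V \right] = m(x, V; b_0), \qquad
x^{\mathrm{opt}}(V) = \arg\max_{x} \, m(x, V; b_0),
$$

so that a consistent, asymptotically normal estimator of b_0 yields the estimated optimal regime g indexed by the maximiser of the fitted expected utility.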

PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events within an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction (MI) events in the Long-Term Intervention with Pravastatin in Ischaemic Disease (LIPID) study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60–0.83). However, the treatment effect was not significant in women, owing to the smaller sample size (HR = 0.75, 95% CI 0.51–1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6–4.4) times the risk for those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4–13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to detect real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk of myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.

1130 - 1230

TUESDAY 1ST DEC, Session 2 (Boardroom): Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities, and is in that sense superior to more basic procedures including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing similar parameter estimates after imputation as those found when the data were fully observed. It will be shown that the amount of missingness present in the data set, and the nature of the variable in question, affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.

STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists use techniques that produce high dimensional data to unveil hidden information. Although several model based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of the responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.
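
A toy sketch of the general "pare down variation" idea in Python (this is our own illustration under assumed notation, not the authors' algorithm or their R package):

```python
import numpy as np

def stepwise_pare(X, y, n_steps=5):
    """Repeatedly run single-term fits, keep the term explaining the most
    remaining variation, subtract its fitted contribution from the response,
    and repeat on the residuals."""
    resid = y - y.mean()
    selected = []
    for _ in range(n_steps):
        ss = np.empty(X.shape[1])
        coef = np.empty(X.shape[1])
        for j in range(X.shape[1]):
            xj = X[:, j] - X[:, j].mean()
            coef[j] = (xj @ resid) / (xj @ xj)
            ss[j] = coef[j] ** 2 * (xj @ xj)   # variation explained by term j alone
        best = int(np.argmax(ss))
        selected.append(best)
        xb = X[:, best] - X[:, best].mean()
        resid = resid - coef[best] * xb        # pare down the remaining variation
    return selected

# Hypothetical usage: two truly influential factors among fifty
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 50))
y = 1.5 * X[:, 3] - 2.0 * X[:, 17] + rng.normal(size=200)
print(stepwise_pare(X, y, n_steps=3))
```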

EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e. the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1 / p2 and r- = (1 - p1) / (1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio rx = f1(x) / f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating rx and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.
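
In the standard notation used above, the three ratios can be written as

$$
r^{+} = \frac{p_1}{p_2} = \frac{\text{sensitivity}}{1-\text{specificity}}, \qquad
r^{-} = \frac{1-p_1}{1-p_2} = \frac{1-\text{sensitivity}}{\text{specificity}}, \qquad
r(x) = \frac{f_1(x)}{f_2(x)},
$$

with f_1 and f_2 the test-measurement densities in the diseased and disease-free groups; the empirical likelihood method targets r(x) directly.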

WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample, for a "validation study", or a sample stratified on a health outcome, for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.

950 - 1030

WEDNESDAY 2ND DEC, Session 1 (Swifts): Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al (2002, Biometrical J 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient, methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al 1997, Ann Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al for their simulations.
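
A schematic form of the survey-weighted member of this class (our notation, for orientation only) is

$$
\sum_{i=1}^{N} \frac{R_i}{\pi_i}\, U_i(\beta) = 0,
$$

where R_i indicates that subject i was fully observed (responded), \pi_i is the corresponding selection or response probability, and U_i(\beta) is the usual case-control score contribution; the broader class studied in the talk replaces the weights R_i/\pi_i with alternative functions chosen for efficiency.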

CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute
2Texas A&M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.

950 - 1030

WEDNESDAY 2ND DEC, Session 1 (Boardroom): Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model reflecting the sampling process commonly used in marine surveys is proposed and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling strongly influences presence/absence measures of species, and is therefore not ignorable.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.

1100 - 1220

WEDNESDAY 2ND DEC, Session 2 (Swifts): Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or by failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
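
As a hedged illustration of two of the candidate approaches (log binomial regression and Zou's log Poisson regression with robust variance), a Python/statsmodels sketch on simulated trial data might look as follows; the data and settings are our own assumptions, not those of the simulation study:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical RCT data: randomised treatment, one baseline covariate, binary outcome
rng = np.random.default_rng(42)
n = 500
treat = rng.integers(0, 2, n)
x = rng.normal(size=n)
p = np.clip(0.15 * np.exp(0.4 * treat + 0.2 * x), 0, 0.99)
y = rng.binomial(1, p)
X = sm.add_constant(np.column_stack([treat, x]))

# Log binomial regression: estimates the adjusted RR directly, but may fail to converge
logbin = sm.GLM(y, X, family=sm.families.Binomial(link=sm.families.links.Log())).fit()

# Log Poisson regression with robust (sandwich) variance, as in Zou (2004)
logpois = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC1")

# Adjusted relative risks for treatment from the two approaches
print(np.exp(logbin.params[1]), np.exp(logpois.params[1]))
```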

MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail espark02gmailcom

The idea of a response adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial, without diminishing its statistical significance and efficiency too much. In addition, the innovation of genomics-related biomedical research makes personalized medicine possible, which also makes adjustment for the covariates of subjects who join the trial an important issue in a clinical trial.

The adaptive design is a longstanding statistical method for situations where the design of a statistical model involves unknown parameters that must be estimated during the course of an experiment. Thus the concept of sequential analysis is naturally involved. The large sample properties of estimation under such a scheme have been studied and can be found in the literature, for example Zhang et al (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires both the estimation and design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure, the multiple-stage method, which requires the estimation and design to be updated at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it maintains the advantage of the fully sequential method to some degree and is more convenient in practical operation. Here we study a three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesized data is also presented.

Traditionally, a response-adaptive (RA) design is used under the assumption that there is no treatment-covariate interaction effect, i.e. the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the method with the RA design will make incorrect treatment allocations; that is, it can be correct in one part of the population but completely wrong in the other. Thus, in this case the CARA design should perform better than the RA design.

In this work we also compare sequential analysis in response adaptive designs with and without covariate adjustment, and a numerical study of synthesized data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail patrickgrahamotagoacnz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple-category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes; and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30-day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.

LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail iliumsorvuwacnz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control for other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, in many surveys respondents may select more than one outcome category, so the observations can fall in more than one cell of the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.
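
For orientation, in an I x J table with cell probabilities \pi_{ij} the local odds ratios being estimated are

$$
\theta_{ij} = \frac{\pi_{ij}\, \pi_{i+1,\, j+1}}{\pi_{i,\, j+1}\, \pi_{i+1,\, j}},
\qquad i = 1, \dots, I-1, \quad j = 1, \dots, J-1,
$$

and the multiple-response setting complicates their estimation because the cell counts entering the sample analogues are no longer based on mutually exclusive classifications.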

1100 - 1220

WEDNESDAY 2ND DEC, Session 2 (Boardroom): Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail rogerlittlejohnagresearchconz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail mdantuonoagricwagovau

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the 'seeming' lack of standard errors in many research papers in animal science in Australia and New Zealand.

FTIR ANALYSIS ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1 Murray Judd2 John Meekings3 Annette Richardson3 and Eric Walton4

1Swinburne University of Technology
2Seeka Kiwifruit Industries
3The New Zealand Institute for Plant and Food Research Ltd
4University of Otago

E-mail dmeyerswineduau

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.

NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail Paulinedinganueduau

There is growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity into modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were ground cover type (covered, uncovered), distance from the log (0cm, 10cm, 20cm, 40cm, 80cm) and depth (1cm, 5cm). Two non-linear mixed models were used to study the different treatment effects.

1330 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1 Robin Thompson2 and Brian R Cullis1
1Wagga Wagga Agricultural Institute, Australia

2Rothamsted Research Harpenden UK

E-mail alisonsmithindustrynswgovau

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples, which precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.

1410 - 1510

WEDNESDAY 2ND DEC, Session 3 (Swifts): Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by the inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second-phase, laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT), coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ), to measure protein abundances.

SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.

1410 - 1510

WEDNESDAY 2ND DEC, Session 3 (Boardroom): Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1 Gillian Whalley1 Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately from relaxation.

Three-dimensional ultrasound images of the heart allow the volume of the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against the first and second derivatives traces out a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation, and so develop a new measure of global cardiac function.
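
A hedged sketch of the smoothing-and-differentiation step in Python (the simulated volume curve, spline choices and names are our own assumptions, not the authors' data or implementation):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical frame-by-frame LV volumes (ml) over one cardiac cycle (s)
t = np.linspace(0.0, 0.9, 25)
vol = 120 - 50 * np.sin(np.pi * t / 0.9) ** 2 \
      + np.random.default_rng(3).normal(0, 1, t.size)

# Smooth the volume measurements into a function of time, then differentiate
f = UnivariateSpline(t, vol, k=5, s=25)
v, dv, d2v = f(t), f.derivative(1)(t), f.derivative(2)(t)

# The (V, dV/dt, d2V/dt2) triples trace a closed loop in three dimensions;
# projecting the loop to maximise its enclosed area is the step developed in the talk
loop = np.column_stack([v, dv, d2v])
print(loop.shape)
```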

VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology divides the substance into, and quantifies the amount of, each compound that makes up the test substance. Typically the first step in an analysis of data like this is alignment, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances, e.g. different kinds of meat, wine of different quality, blood serum from healthy and non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al (2009), Anal Chem 81(3), 1000-1007.
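
A minimal generic sketch of DTW with a penalty on non-diagonal steps (our own illustrative code, not the implementation of Clifford et al.):

```python
import numpy as np

def dtw_variable_penalty(a, b, penalty):
    """Dynamic time warping distance between signals a and b in which every
    non-diagonal (warping) step incurs an additive penalty; penalty may be a
    scalar or an array indexed by position in a."""
    n, m = len(a), len(b)
    pen = np.broadcast_to(np.asarray(penalty, dtype=float), (n,))
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            D[i, j] = min(D[i - 1, j - 1] + d,           # diagonal step: no penalty
                          D[i - 1, j] + d + pen[i - 1],  # non-diagonal steps are penalised
                          D[i, j - 1] + d + pen[i - 1])
    return D[n, m]

# Usage: a shifted Gaussian peak, loosely mimicking a drifting chromatogram feature
t = np.linspace(0, 1, 200)
sig1 = np.exp(-((t - 0.45) ** 2) / 0.001)
sig2 = np.exp(-((t - 0.50) ** 2) / 0.001)
print(dtw_variable_penalty(sig1, sig2, penalty=0.0),
      dtw_variable_penalty(sig1, sig2, penalty=0.05))
```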

A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1 Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, for example, two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may receive various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation of the model. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.

1540 - 1700

WEDNESDAY 2ND DEC, Session 4 (Swifts): Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1
1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians grows across a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although this is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances where this is not the case, which will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the FWER or FDR should enable us to improve the power of the procedures used to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.

METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand
2University of South Carolina, USA

3University of Montreal Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm for sampling from non-logconcave univariate densities. Using various examples we demonstrate their properties and efficiencies, and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.

BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1
1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) are used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1
1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with a general distribution. In this talk I will propose a new methodology, using copulas, to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.

1540 - 1700

WEDNESDAY 2ND DEC, Session 4 (Boardroom): Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1
1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distributions in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation consists proportionally of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show through simulation studies and some real-world biological data sets that the new approach performs better.

CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function or rule from the training data and to use the rule to classify new data (with unknown class labels) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally sized or balanced, and the classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class or group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). The traditional classification techniques perform poorly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being on building models that correctly classify the minority class.

In this presentation a brief overview of the approaches found in the literature is given, followed by details of some proposed alternatives. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples, and the findings are discussed.

COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of error rates is of vital importance in classification problems, as this is used as a basis for choosing the best discriminant function, i.e. the one with the minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally, a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data, in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and the QDF are derived and computed for various covariance structures in a simulation exercise, which serves as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. Moreover, this approximation provides a closed form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data

THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS – A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1
1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory Bowel Diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.

9:50 - 10:30

THURSDAY 3RD DEC, Session 1 (Swifts): Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao12 Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia QLD, Australia

E-mail klecaouqeduau

In cancer studies, various clinical or pathological factors have been evaluated as prognosis factors (e.g. tumour size, histological grade, ...). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.

EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1
1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift is used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.

9:50 - 10:30

THURSDAY 3RD DEC, Session 1 (Boardroom): Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.

ENVIRONMENTAL IMPACT ASSESSMENTS: A STATISTICAL ENCOUNTER

Dave Saville1
1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting.

11:00 THURSDAY 3RD DEC, Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1 IH Delacy12 J Crossa3 PM Kroonenberg4 MJ Dieters1 and KE Basford12

1The University of Queensland, Australia
2Australian Centre for Plant Functional Genomics, Australia
3CIMMYT, Mexico
4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control family-wise error rate and address non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM blocks and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.

11:40 - 12:20

THURSDAY 3RD DEC, Session 2 (Swifts): Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1
1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study the linear combination of markers, which usually improves the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC respectively). In some medical diagnostics it is necessary to confine the false positive rate within a specific range, which makes the pAUC a reasonable choice in such circumstances. Thus we emphasize pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

l_p = (w_D S_D + w_D̄ S_D̄)^(-1) (m_D - m_D̄)

where m_D, S_D and m_D̄, S_D̄ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients w_D, w_D̄ ∈ R^1 depend on the given specificity and are also functions of l_p. Thus the solution of l_p requires some iteration procedure. We apply it to the data set of Liu et al (2005, Stat in Med) and the numerical results show that our method outperforms that of Liu et al (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large sample properties of this method are derived. We then apply it to some real data sets and the results are very promising, locating markers that are never found via AUC-based methods.
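
The analytic solution above involves an iteration whose details are given in the paper. Purely as a generic numerical illustration, and not the authors' method, the R sketch below maximises an empirical pAUC (false positive rate restricted to [0, 0.2]) over linear combinations of three simulated markers; all data and names are invented.

set.seed(1)
n <- 200
X_D  <- matrix(rnorm(n * 3, mean = 0.5), ncol = 3)   # diseased-group markers
X_ND <- matrix(rnorm(n * 3, mean = 0.0), ncol = 3)   # non-diseased-group markers

pauc <- function(w, t0 = 0.2) {
  s_D  <- as.vector(X_D %*% w)
  s_ND <- as.vector(X_ND %*% w)
  fpr  <- seq(0.002, t0, length.out = 100)
  thr  <- quantile(s_ND, probs = 1 - fpr)             # thresholds giving each false positive rate
  tpr  <- sapply(thr, function(k) mean(s_D > k))      # sensitivity at each threshold
  mean(tpr) * t0                                      # crude numerical integration over [0, t0]
}

fit   <- optim(c(1, 1, 1), function(w) -pauc(w / sqrt(sum(w^2))))
w_hat <- fit$par / sqrt(sum(fit$par^2))               # scale does not affect the ROC curve
w_hat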

A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1
1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore an asymmetric decision rule as proposed by Bauer & Köhne (1994) for adaptive designs is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994). For example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
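
A minimal R sketch of the decision rule quoted above, assuming the critical value c_alpha is supplied from the paper (it is not derived here); the default a1 = 0.1793 is only the value quoted for alpha = 0.05 and a0 = 0.5.

## Decision rule of the modified combination test as stated above.
## c_alpha must be supplied (taken from the paper, not derived here).
modified_combination_test <- function(p1, p2, a1 = 0.1793, a0 = 0.5, c_alpha) {
  reject <- max(p1, p2) <= a1 || (max(p1, p2) <= a0 && p1 * p2 <= c_alpha)
  list(reject = reject, stop_for_futility = p1 > a0)
}

## Example call; the c_alpha value here is a placeholder for illustration only
modified_combination_test(p1 = 0.03, p2 = 0.04, c_alpha = 0.0087)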

11:40 - 12:20

THURSDAY 3RD DEC, Session 2 (Boardroom): Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1 2 Colin Cavanagh2 3 Matthew Morell2 3 and Andrew George1 2

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Food Futures National Research Flagship
3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represent phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, due to multiple founders and intermediate generations being unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.

PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1 Benoit Auvray1 Peter Amer2 Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch, Invermay Agricultural Centre, Mosgiel, New Zealand
2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will almost be true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets and appropriate phenotypes for datasets such as those in the sheep industry.

POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler1
1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p values are often interpreted in a classical analysis as giving 1 minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews 2001). In this poster Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1. Matthews (2001, J Stat Plan Inf 94, 43-58)
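
As a small invented illustration of the contrast drawn above (not one of the poster's two data sets): for binomial data a classical test returns a p value, while a Bayesian analysis with a flat prior returns the posterior probability of the hypothesis directly.

successes <- 18; trials <- 25                 # hypothetical data
## Classical: p value from an exact binomial test of H0: p = 0.5
binom.test(successes, trials, p = 0.5)$p.value
## Bayesian: Beta(1, 1) prior gives a Beta(1 + 18, 1 + 7) posterior for p
1 - pbeta(0.5, 1 + successes, 1 + trials - successes)   # Pr(p > 0.5 | data)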

MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1 Tony Swain2 Olena Kravchuk1 and Geoffry Fordyce2

1School of Land Crop and Food Sciences, University of Queensland, Qld, Australia
2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and Androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (< 5.7%) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms and unequal variances for the repeated measures, with the heifer × day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al, Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The interaction vaccine × weaner size × time was only significant in 1992. The interaction vaccine × time was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccinations, but the vaccine effect diminished as the heifers aged. The interaction nutrition × weaner size × time was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study with a complicated unbalanced repeated measures design.
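
The model was fitted by REML in GenStat; the R sketch below, using nlme, is only a rough analogue with hypothetical variable names and simulated data, and it approximates the heifer × day covariance structure by separate day variances with heifers nested within paddocks.

library(nlme)
set.seed(1)
## Entirely hypothetical layout: 40 heifers in 4 paddocks, 5 measurement days
dat <- expand.grid(day = factor(1:5), heifer = factor(1:40))
dat$paddock <- factor(rep(rep(1:4, each = 10), each = 5))
dat$vaccine <- factor(rep(rep(c("A4", "control"), 20), each = 5))
dat$size    <- factor(rep(rep(c("light", "heavy"), each = 2, times = 10), each = 5))
dat$logP4   <- rnorm(200) + rep(rnorm(40, sd = 0.3), each = 5)   # fake log P4 response

fit <- lme(logP4 ~ vaccine * size * day,
           random  = ~ 1 | paddock/heifer,
           weights = varIdent(form = ~ 1 | day),   # unequal variance per measurement day
           data    = dat)
anova(fit)   # candidate fixed-effect terms for backward elimination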

USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1
1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable. Omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the tree, which is used to predict the other half of the data. By examining the predictive ability of several thousands of trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
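
A minimal sketch of the gbm workflow described above, with simulated Poisson counts and generic predictor names standing in for the cicada abundance and landscape data.

library(gbm)
set.seed(1)
dat <- data.frame(matrix(rnorm(500 * 12), ncol = 12))
names(dat) <- paste0("site_var", 1:12)                   # hypothetical landscape predictors
dat$count <- rpois(500, exp(0.5 * dat$site_var1 - 0.3 * dat$site_var2))

fit <- gbm(count ~ ., data = dat, distribution = "poisson",
           n.trees = 3000, interaction.depth = 3, shrinkage = 0.01,
           bag.fraction = 0.5, train.fraction = 0.5)     # fit on half, assess on the other half

best <- gbm.perf(fit, method = "test", plot.it = FALSE)  # number of trees minimising test error
summary(fit, n.trees = best, plotit = FALSE)             # relative influence of each variable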

ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land Crop and Food Sciences, University of Queensland
2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling in an experiment investigating the digestibility of the sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions were obviously, and in a complex way, changing with changes in the milling energy. The average volumetric diameter alone was not an adequate summary for the distributions. It was thus required to construct a tailored algorithm for summarizing the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.
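
One way to fit such a mixture to binned volumetric percentages is least squares on the binned proportions. The R sketch below does this for a three-component log-normal mixture with hypothetical bin edges and data; it is not the authors' tailored algorithm, and the Weibull version would simply swap plnorm for pweibull.

## Hypothetical particle-size bins (microns) and binned volumetric percentages
edges <- exp(seq(log(1), log(1000), length.out = 31))
truth <- function(x) 0.3 * plnorm(x, log(10), 0.4) +
                     0.5 * plnorm(x, log(60), 0.5) +
                     0.2 * plnorm(x, log(300), 0.3)
obs <- 100 * diff(truth(edges))                          # "observed" % of volume per bin

mix_cdf <- function(x, th) {
  w <- exp(th[1:3]); w <- w / sum(w)                     # mixing weights via softmax
  w[1] * plnorm(x, th[4], exp(th[7])) +
  w[2] * plnorm(x, th[5], exp(th[8])) +
  w[3] * plnorm(x, th[6], exp(th[9]))
}
sse <- function(th) sum((obs - 100 * diff(mix_cdf(edges, th)))^2)

start <- c(0, 0, 0, log(c(5, 50, 500)), log(c(0.5, 0.5, 0.5)))
fit <- optim(start, sse, method = "BFGS", control = list(maxit = 2000))
round(exp(fit$par[1:3]) / sum(exp(fit$par[1:3])), 3)     # fitted mixing weights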

PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1 Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne
2Remote sensing team, CSIRO Sustainable Ecosystems

3Senior Lecturer The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
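
A minimal sketch of ridge and LASSO fits with cross-validated penalty choice using glmnet, standing in for the fuller comparison (Partial Least Squares, 0.632+ bootstrap, GCV) reported in the paper; the simulated predictors only mimic high dimensionality and collinearity.

library(glmnet)
set.seed(1)
n <- 100; p <- 200
x <- matrix(rnorm(n * p), n, p)
x[, 2] <- x[, 1] + rnorm(n, sd = 0.05)              # deliberate collinearity
y <- x[, 1] - 2 * x[, 3] + rnorm(n)

ridge <- cv.glmnet(x, y, alpha = 0)                 # alpha = 0: ridge penalty
lasso <- cv.glmnet(x, y, alpha = 1)                 # alpha = 1: LASSO penalty
c(ridge = min(ridge$cvm), lasso = min(lasso$cvm))   # cross-validated mean squared errors
coef(lasso, s = "lambda.min")[1:6, ]                # intercept and first few LASSO coefficients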

CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia
2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens- if not hundreds-of-thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and explore the extent to which this might be a problem in applications. In particular, we compare the analysis of log-transformed data to full compositional data analysis.
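
To make the comparison concrete, a small sketch with hypothetical counts (not an 'omics' data set) contrasting a naive log transform with Aitchison's centred log-ratio (clr) transform, which removes the unit-sum constraint.

set.seed(1)
counts <- matrix(rpois(5 * 8, lambda = 20), nrow = 5)   # 5 samples x 8 components
comp   <- (counts + 0.5) / rowSums(counts + 0.5)        # closed compositions (offset guards against zeros)

log_naive <- log(counts + 0.5)                          # still carries the total-count constraint
clr       <- log(comp) - rowMeans(log(comp))            # centred log-ratio transform
round(rowSums(clr), 10)                                 # clr rows sum to zero by construction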

GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

Gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for the growth. We have investigated the validity of the gamma distribution as a model with the NPF trawling experiment data. By using a Probability-Probability plot as a visual tool of validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small. It is partly explained by a sensitivity analysis of the gamma distribution. As an alternative estimate, we have employed a minimum squares type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data except for one species out of 83 species.
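
A sketch of the two estimators being compared, maximum likelihood versus minimising squared distances on the Probability-Probability plot, applied to simulated weights rather than the NPF data.

set.seed(1)
w <- rgamma(150, shape = 1.2, rate = 0.5)               # hypothetical fauna weights
x <- sort(w); p_emp <- (seq_along(x) - 0.5) / length(x) # empirical probabilities

## (a) maximum likelihood estimate of shape and rate
ml <- MASS::fitdistr(w, "gamma")

## (b) minimum squares on the P-P plot (parameters on the log scale for positivity)
pp_ss <- function(th) sum((pgamma(x, exp(th[1]), exp(th[2])) - p_emp)^2)
ls <- optim(log(c(1, 1)), pp_ss)

rbind(ML = ml$estimate, PP = exp(ls$par))               # compare the two fits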

TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1
1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Diabetes complications such as kidney disease cause patients much pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of serum creatinine level changes over time, the lack of longitudinal data and information on this tendency in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that utilized the files of patients with type 2 diabetes who attended the Isfahan Metabolic and Endocrine Research Center from 1997 to 2007. This information was collected longitudinally. We used linear mixed-effects models to analyse the data.

Results: Results of the linear mixed effects model showed that there is a significant association between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information that this study provides can be used to identify groups with a high risk of renal dysfunction.

Key Words: Longitudinal study, Mixed effect models, Creatinine, Type 2 diabetes

THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS UNDERGOING ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics, Department of
2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models.

Background: Most studies on postoperative complications for isolated coronary artery bypass graft (CABG) surgeries from one population may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgeries for an Australian population, because there is no model developed in the Australian context.

Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation (60%) set and a model validation (40%) set. The data in the creation set were used to develop the model and then the validation set was used to validate the model. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC and the Hosmer-Lemeshow p-value respectively.

Results: Among the 14533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3). The two postoperative complications are new renal failure with 365 and stroke with 138. The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L < 0.001).

Conclusion: We have identified risk factors for two major postoperative complications for CABG surgery.

SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2
1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics, Informatics and Statistics
2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis where physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data. Instead, they are usually point predictions from spatial models based on auxiliary data sources. It is not clear what kind of effects the modelled covariates will have on the model, although simple approximations for simple models do give indications. We have performed some simulation studies to investigate the manner of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.

COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno
2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality at intensive care units. The aim of this article is to identify the dependency structure of gene variants which have an influence on septic states of paediatric patients.

The data set contains data on 580 paediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70. The results were published in [1,2,3]. To identify the role of different combinations of gene variants, and to describe the differences in frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. This way it was possible to create a five-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patients group. The typical combinations of gene variants for the healthy group and for the septic patients group were then found. The results correspond nicely to the results published in [1, 2, 3] for individual genes, and enable recognition of the typical combinations of variants of the six genes on which attention should be focused.

References: [1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal/permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol 33, pp 2158-2164. [2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. In Cesko-Slovenska Pediatrie 59, pp 632-636, 2004. [3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukine-6 gene variants and the risk of sepsis development in children. Human Immunology, ELSEVIER SCIENCE INC, ISSN 0198-8859, 2007, vol 68, pp 756-760.
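
As a rough illustration of the modelling step, the sketch below fits one hierarchical log-linear model to a multi-way table of gene-variant counts with MASS::loglm; the three two-level 'genes' and their counts are hypothetical and far smaller than the five-gene table analysed here.

library(MASS)
set.seed(1)
## Hypothetical two-level variants for three genes
variants <- data.frame(TLR299 = sample(c("wt", "var"), 300, replace = TRUE),
                       BPITaq = sample(c("wt", "var"), 300, replace = TRUE),
                       IL6    = sample(c("wt", "var"), 300, replace = TRUE))
tab <- table(variants)

## Hierarchical model with all two-way associations but no three-way interaction
fit <- loglm(~ (TLR299 + BPITaq + IL6)^2, data = tab)
fit   # likelihood ratio and Pearson goodness-of-fit tests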

IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd
220 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application, with a tractor-mounted boom sprayer, of Bacillus thuringiensis Berliner (Bt) was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion p of the moths remained was also estimated, viz for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
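
A sketch of the two steps described above: fitting the log-link Poisson GLM of trap catch against distance, then solving the displayed equation for the distance c within which a proportion p of the moths remain. The trap catches are simulated, not the field data.

set.seed(1)
distance <- rep(seq(50, 750, by = 100), each = 4)            # hypothetical trap distances (m)
catch <- rpois(length(distance), lambda = exp(3.5 - 0.01 * distance))

fit <- glm(catch ~ distance, family = poisson(link = "log")) # dispersal curve on the log scale
b <- -coef(fit)[["distance"]]                                # decay rate of the dispersal curve

p <- 0.90                                                    # proportion of moths remaining within c
c_p <- uniroot(function(c) exp(-b * c) * (1 + b * c) - (1 - p),
               interval = c(1, 5000))$root
c_p                                                          # estimated containment distance (m)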

IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury
2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study all oxygen saturation measurements across different behavioural states were combined together. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that the clustering based on active sleep segments performs better than the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M, Lee DS, Russell G, et al (2008, Australian Statistical Conference 2008)
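
A sketch of the clustering step with simulated per-segment coefficients of variation: each baby's empirical distribution function is evaluated on a common grid, the babies are clustered hierarchically on those curves, and the tree is cut into two groups. All numbers are invented.

set.seed(1)
n_babies <- 17; n_segments <- 40
cv <- matrix(abs(rnorm(n_babies * n_segments, mean = 0.03, sd = 0.01)),
             nrow = n_babies)                        # hypothetical per-segment CVs, one row per baby

grid  <- seq(min(cv), max(cv), length.out = 100)
ecdfs <- t(apply(cv, 1, function(x) ecdf(x)(grid)))  # each row: the baby's ECDF on a common grid

hc <- hclust(dist(ecdfs), method = "average")        # Euclidean distance between ECDF curves
groups <- cutree(hc, k = 2)                          # putative stable versus unstable grouping
table(groups)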

Index of Presenting Authors

Arnold R 33; Asghari M 26; Baird D 30; Balinas V 46; Barnes M 27; Basford KE 84; Beath K 51; Bland JM 49; Briggs J 76; Burridge C 28; Burzykowski T 73; Butler R (poster) 89; Campain A 56; Chang K 70; Chang Y 85; Chee C 77; Clark ASS 36; Clarke S 74; Clifford D 72; Connolly P (poster) 91; Cui J 55; D'Antuono M 67; Darnell R (1) 35; Darnell R (2) 47; Davy M 40; Day S 43; Ding P 69; Dobbie M 48; Dodds K 88; Fewster R 37; Forrester R 34; Ganesalingam S 79; Ganesh S 78; Gatpatan JMC 48; Graham M 33; Graham P 65; Huang E 87; Hwang J 57; Ihaka R 36; Jones G 45; Kifley A 53; Kipnis V 61; Koolaard J (poster) 92; Kravchuk O (poster 1) 90; Kravchuk O (poster 2) 92; Lazaridis D 93

Le Cao K 81; Littlejohn R 67; Liu I 66; Liu J 82; Lumley T 59; Marschner I 52; Matthews D 58; McLachlan A 44; Meyer D 68; Meyer R 75; Mohebbi M 38; Mueller S 47; Muller W (poster) 94; Naka M (poster) 95; Neeman T 82; Neuhäuser M 86; Orellana L 54; Park E 64; Park Z 42; Pirie M 32; Poppe K 71; Rousta S (poster) 96; Ruggiero K 71; Ryan L 25; Sanagou M (poster) 97; Saville D 83; Scott A 60; Shimadzu H 62; Shimadzu H (poster) 98; Sibanda N 76; Smerek M (poster) 99; Smith AB 69; Stewart M 77; Stojanovski E 31; Taylor J 41; Thijs H 50; Triggs CM 80; Wallace AR (poster) 100; Wang Y 29; Welsh A 51; Williams E 70; Yee T 62; Yelland L 63; Yoon H 39; Zahari M (poster) 101


DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau


Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde


Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz

Page 5: here - Conferences @FOS - The University of Auckland

2 The International Biometric Society Australasian Region Conference

CONFERENCE AT A GLANCEWelcome Reception - Sunday 29 Nov

A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre Drinks (beer house wine soft drinks and orange juice) and a selection of hot and cold hors drsquooeuvres will be on offer

Keynote addresses

Louise Ryan (Monday 30 Nov) Martin Bland (Tuesday 1 Dec) Thomas Lumley (Wednesday 2 Dec) Chris Triggs (Thursday 3 Dec)

All keynote addresses begin at 9am and will be held in Swifts (see map of venue on page 16)

Invited Speakers

Ross Ihaka (1330 Monday 30 Nov) Alison Smith (1330 Wednesday 2 Dec)

Kaye Basford (1100 Thursday 3 Dec)

All invited speaker talks will be held in Swifts (see map of venue on page 16)

Organised Social Activities - Tuesday 1 Dec

This is a long-standing part of the conference program so keeping with tradition we have arranged four options for the afternoon of Tuesday 1 December after lunch We hope that you will find at least one of these activities attractive because we want you to relax get a breath of fresh air (or sulphur fumes at Orakei Korako) have fun and see some of what this part of New Zealand has to offer especially for first-time visitors These activities are optional and tickets need to be purchased for them through the conference organisers Preferences will be considered on a first come first served basis If you have queries about any of the social activities please contact Hans Hockey in the first instance An afternoon snack will be provided on the cruise and kayaking while the bus tour visits a cafe and the jet boating trip is too action-packed for eating

If you have not already registered for one of these activities please talk to someone at the registration desk to make arrangements before Tuesday morning Please note that all costs for the activities are in New Zealand Dollars

3The International Biometric Society Australasian Region Conference

Conference At A Glance

Conference dinner - Wednesday 2 Dec

To add some novelty to the conference program the conference dinner will be held at the Prawn Park restaurant home of Shawn the Prawn At the Prawn Park just 10 minutes drive north of Taupo on the Waikato River (see map on page 8) you will be able to compete at prawn fishing or the Killer Prawn Hole-in-One Golf Challenge take a guided tour of the nursery and hatchery enjoy fun and interactive water features and a glass of bubbly in the geothermal footbath as well as a sumptuous meal with breathtaking views at the riverside restaurant (on the patio weather permitting) Drinks (wine and non-alcoholic) will be provided and all dietary requirements can be catered for

Coaches have been arranged to transfer delegates to Huka Prawn Farm from the Suncourt Hotel leaving 6 pm with return trips at the conclusion of the event

Conference Prizes - Donated by CSIRO Mathematical and

Information Sciences

Prizes will be awarded for the best oral presentation and the best poster presentation by a young statistician as judged by a panel To be eligible for these awards the presenter must be a member of the IBS Australasian Region and be either a student (full-time or part-time) or a person who has graduated with a Bachelorrsquos Degree (in a biometrical-related field) within the last five years or a person awarded a Postgraduate Degree within the past year

4 The International Biometric Society Australasian Region Conference

KEYNOTE SPEAKERSMartin Bland University of York

Martin Bland joined the University of York as Professor of Health Statistics in 2003 Before this he spent 27 years at St Georgersquos Hospital Medical School University of London following posts at St Thomasrsquos Hospital Medical School and in industry with ICI He has a BSc in mathematics an MSc in Statistics and a PhD in epidemiology He is the author or co-author of An Introduction to Medical Statistics now in its third edition and Statistical Questions in Evidence-based Medicine both Oxford University Press 190+ refereed journal articles reporting public health and clinical research and on research methods and with Prof Doug Altman the Statistics Notes series in the British Medical Journal He is currently working on clinical trials in wound care hazardous alcohol use depression irritable bowel syndrome and stroke prevention His personal research interests are in the design and analysis of studies of clinical measurement and of cluster randomised clinical trials His 1986 Lancet paper with Doug Altman on statistical methods for assessing agreement between two methods of clinical measurement has now been cited more than 13000 times and is the most cited paper ever to appear in the Lancet and has been reported to be the sixth most highly cited statistical paper ever

Martin presented a two-day satellite course in Auckland on 25-26 November on Cluster Randomised Trials

Thomas Lumley University of Washington

Thomas Lumley is an Associate Professor in the Biostatistics Department at the University of Washington in Seattle Thomas has accrued an impressive body of work and awards in a comparatively short amount of time Since completing his PhD in 1998 Thomas has published well over 100 peer reviewed articles in the leading journals of statistics biostatistics and the health sciences on theory methodology and application In addition he has given a substantial number of talks and workshops around the world In 2008 Thomas was awarded the Gertrude Cox Award for contributions to Statistical Practice Thomas is also a member of the R Core development team and his expertise in the field of statistical computing is recognised worldwide

5The International Biometric Society Australasian Region Conference

Keynote Speakers

Louise Ryan CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics Informatics and Statistics (CMIS) Dr Ryan has a distinguished career in biostatistics having authored or co-authored over 200 papers in peer-reviewed journals Louise is a fellow of the American Statistical Association and the International Statistics Institute and is an elected member of the Institute of Medicine She has served in a variety of professional capacities including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society She has served on advisory boards for several government agencies in the USA including the National Toxicology Program and the Environmental Protection Agency as well as several committees for the National Academy of Science She retains an adjunct professorship at Harvard

Chris Triggs University of Auckland

Chris Triggs is a Professor as well as being the current department head of Statistics at the University of Auckland New Zealand He has been a respected statistician for 30 years specializing in fields as diverse as experimental design and forensic science Professor Triggs has published more than 90 papers in a wide variety of statistical fields His research interests include experimental design population genetics and the application of statistical methods in many fields of science including forensic science and nutrigenomics He has lectured extensively in many of these subjects in Australasia Professor Triggs is an Associate Editor for Biometrics and is often called upon as referee for many other journals

6 The International Biometric Society Australasian Region Conference

INVITED SPEAKERSRoss Ihaka University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland He is recognized as one of the originators of the R programming language In 2008 he received the Royal Society of New Zealandrsquos Pickering Medal for his work on R

Kaye Basford University of Queensland

Kaye Basford is Head of the School of Land, Crop and Food Sciences at the University of Queensland. Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments, in particular using a pattern analysis approach. Kaye is currently IBS Vice-President, in advance of her Presidential term in 2010-11.

Alison Smith NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit, where she works on and researches methodology for plant breeding multi-environment variety trials, plant quality trait experiments, micro-array data and outlier detection in linear mixed models.


GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events.

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions.

Conference Catering
Lunches, morning and afternoon teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16).

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk.

Transport has been arranged in coaches to transfer delegates to the dinner from the Suncourt Hotel & Conference Centre, leaving at 6 pm, with return trips at the conclusion of the event.

Welcome Reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.


VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel
Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (0.2 km)
Turn left at Titiraupenga Street (31 m)
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street then onto Tongariro Street)
Continue onto Tongariro Street (1.1 km - go through one roundabout)
Continue onto SH 1 / SH 5 (1.0 km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past the Honey Hive to the end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo. Phone (07) 378 8265. www.suncourt.co.nz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo


ORGANISED SOCIAL ACTIVITIES

Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region.

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member, non-member or student) attending the whole week. Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events.

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere, so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6 pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim and then a meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo; Tel (07) 378-7080).

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities - Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors, smell the coffee brewing as you board the Waikare II, then take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings. A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina. Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park. The sights are amazing all year round. Afternoon tea is included as part of your charter, and tea or coffee are complimentary throughout the cruise. There are also full bar facilities.

Fishing for, and hopefully eating, rainbow or brown trout is included in the charter, although to meet licence requirements only four clients can be nominated to actually land the catch. Only 4 lines can be put out at a time on downriggers. If successful, any catch can be barbequed or sashimied and served and shared onboard - there is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting. The cost is $1.80 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch; returning about 5.30 pm to berth
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front
Take: Swimwear, including a towel, if you want an invigorating deep-water swim off the launch. Don't forget to take your camera, as some of the scenery can only be seen from on the water
Cost: $70 per person, based on a three hour scenic charter including fishing; clay bird shooting extra at $1.80 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River, and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.


Leaving the gushing sounds of the mesmerising falls, you cut through leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river, and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick-up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (a waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating, geothermal and nature - Orakei Korako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmland that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Korako, possibly the best thermal area in New Zealand.


In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's Cave and some of the largest silica terraces in the world.

While the park visitors are on land, option 2 ($140) whisks swimmers away to the Squeeze. You will disembark the boat in knee-deep warm water. After manoeuvring your way through narrow crevasses, climbing boulders and wading through waist-deep warm water, you emerge into stunning native New Zealand bush. Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool.

Then the groups rejoin for the thrilling return trip giving a total trip time of about three hours This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience

Time: Transport departs Suncourt Hotel at 1.30 pm; returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera, as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1, including park admission; $140 pp for option 2; both options include transport
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used


4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop-off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly; terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: A cafe snack is not included, but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22


SPONSORS

The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland


AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax


VENUE FLOOR PLAN

1. Boardroom: for all Boardroom session presentations
2. Swifts: for keynote addresses, invited speaker talks and all Swifts sessions
3. Bathrooms/Toilets
4. 'Chill on Northcroft' Restaurant: all morning/afternoon teas and lunches will be provided here
5. Gullivers: computer room with two internet access desktops
6. Lems: registration desk location, and further desk space and power points for wireless internet access

(The floor plan itself is a map of the venue with these six locations marked.)

CONFERENCE TIMETABLE

SUNDAY 29TH NOV
1600  Conference Registration opens
1800  Welcome Reception
      Dinner (own arrangement)

MONDAY 30TH NOV
850   Presidential Opening (Swifts)
      Graham Hepworth, University of Melbourne
900   Keynote Address (Swifts)
      Louise Ryan, CSIRO Mathematics Informatics and Statistics
      Quantifying uncertainty in risk assessment (Chair: Graham Hepworth)

950-1030  Session 1
  Swifts - Medical (Chair: John Field)
    950   Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approach - Mohamad Asghari, Tarbiat Modares University
    1010  Personalised medicine: endovascular aneurysm repair risk assessment model using preoperative variables - Mary Barnes, CSIRO Mathematics Informatics and Statistics
  Boardroom - Ecological Modelling (Chair: Teresa Neeman)
    950   Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splines - Charis Burridge, CSIRO Mathematics Informatics and Statistics
    1010  Rank regression for analyzing environmental data - You-Gan Wang, CSIRO Mathematics Informatics and Statistics

1030  Morning Tea (30 minutes)

1100-1220  Session 2
  Swifts - Modelling (Chair: Andrew McLachlan)
    1100  Introduction to quantile regression - David Baird, VSN NZ Ltd
    1120  Incorporating study characteristics in the modelling of associations across studies - Elizabeth Stojanovski, University of Newcastle
    1140  A comparison of matrices of time series with application in dendroclimatology - Maryanne Pirie, University of Auckland
    1200  How SAS and R integrate - Michael Graham, SAS Auckland
  Boardroom - Environmental & Methods (Chair: Zaneta Park)
    1100  Capture recapture estimation using finite mixtures of arbitrary dimension - Richard Arnold, Victoria University
    1120  The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabies - Robert Forrester, ANU
    1140  Model based grouping of species across environmental gradients - Ross Darnell, CSIRO Mathematics Informatics and Statistics
    1200  The use of the chi-square test when observations are dependent - Austina Clark, University of Otago

1220  Lunch (1 hour 10 minutes)

1330  Invited Speaker (Swifts)
      Ross Ihaka, University of Auckland
      Writing Efficient Programs in R and Beyond (Chair: Renate Meyer)

1410-1510  Session 3
  Swifts - Variance (Chair: Geoff Jones)
    1410  Variance estimation for systematic designs in spatial surveys - Rachel Fewster, University of Auckland
    1430  Variance components analysis for balanced and unbalanced data in reliability of gait measurement - Mohammadreza Mohebbi, Monash University
    1450  Modernizing AMOVA using ANOVA - Hwan-Jin Yoon, ANU
  Boardroom - Genetics (Chair: John Koolaard)
    1410  Developing modules in GenePattern for gene expression analysis - Marcus Davy, Plant and Food Research
    1430  High dimensional QTL analysis within complex linear mixed models - Julian Taylor, CSIRO Mathematics Informatics and Statistics
    1450  Correlation of transcriptomic and phenotypic data in dairy cows - Zaneta Park, AgResearch

1510  Afternoon Tea (30 minutes)

1540-1700  Session 4
  Swifts - Modelling (Chair: Mario D'Antuono)
    1540  Non-inferiority margins in clinical trials - Simon Day, Roche Products Ltd
    1600  Data processing using Excel with R - Andrew McLachlan, Plant and Food Research Lincoln
    1620  Investigating covariate effects on BDD infection with longitudinal data - Geoffrey Jones, Massey University
    1640  Statistical modelling of intrauterine growth for Filipinos - Vincente Balinas, University of the Philippines Visayas
  Boardroom - Ecology (Chair: Rachel Fewster)
    1540  Visualising model selection criteria for presence and absence data in ecology - Samuel Mueller, University of Sydney
    1600  Estimating weights for constructing composite environmental indices - Ross Darnell, CSIRO Mathematics Informatics and Statistics
    1620  A spatial design for monitoring the health of a large-scale freshwater river system - Melissa Dobbie, CSIRO Mathematics Informatics and Statistics
    1640  Backfitting estimation of a response surface model - Jhoanne Marsh C Gatpatan, University of the Philippines Visayas

1700  Poster Session (Chair: Melissa Dobbie)
1800  Dinner (own arrangement)

TUESDAY 1ST DEC
900   Keynote Address (Swifts)
      Martin Bland, University of York
      Clustering by treatment provider in randomised trials (Chair: Simon Day)

950-1030  Session 1
  Swifts - Missing Data (Chair: Vanessa Cave)
    950   The future of missing data - Herbert Thijs, Hasselt University
    1010  Application of latent class with random effects models to longitudinal data - Ken Beath, Macquarie University
  Boardroom - Count Data (Chair: Hwan-Jin Yoon)
    950   A strategy for modelling count data which may have extra zeros - Alan Welsh, ANU
    1010  A reliable constrained method for identity link Poisson regression - Ian Marschner, Macquarie University

1030  Morning Tea / IBS Biennial General Meeting (60 minutes)

1130-1230  Session 2
  Swifts - Medical (Chair: Hans Hockey)
    1130  Multivariate response models for global health-related quality of life - Annette Kifley, Macquarie University
    1150  Estimation of optimal dynamic treatment regimes from longitudinal observational data - Liliana Orellana, Universidad de Buenos Aires
    1210  Parametric conditional frailty models for recurrent cardiovascular events in the LIPID study - Jisheng Cui, Deakin University
  Boardroom - Modelling (Chair: Olena Kravchuk)
    1130  Building a more stable predictive logistic regression model - Anna Campain, University of Sydney
    1150  Stepwise paring down variation for identifying influential multifactor interactions - Jing-Shiang Hwang, Academia Sinica
    1210  Empirical likelihood estimation of a diagnostic test likelihood ratio - David Matthews, University of Waterloo

1230  Lunch (1 hour)
1330  Organised Social Activities
1800  Dinner (own arrangement)

WEDNESDAY 2ND DEC
900   Keynote Address (Swifts)
      Thomas Lumley, University of Washington
      Using the whole cohort in analysis of subsampled data (Chair: Alan Welsh)

950-1030  Session 1
  Swifts - Clinical Trials (Chair: Ian Marschner)
    950   Adjusting for nonresponse in case-control studies - Alastair Scott, University of Auckland
    1010  Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associations - Victor Kipnis, USA National Cancer Institute
  Boardroom - Fisheries (Chair: Charis Burridge)
    950   An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimation - Hideyasu Shimadzu, Geoscience Australia
    1010  On the 2008 World Fly Fishing Championships - Thomas Yee, University of Auckland

1030  Morning Tea (30 minutes)

1100-1220  Session 2
  Swifts - Medical Models (Chair: Katrina Poppe)
    1100  Relative risk estimation in randomised controlled trials: a comparison of methods for independent observations - Lisa Yelland, University of Adelaide
    1120  Multiple stage procedures in covariate-adjusted response-adaptive designs - Eunsik Park, Chonnam National University
    1140  Potential outcomes and propensity score methods for hospital performance comparisons - Patrick Graham, University of Otago
    1200  Local odds ratio estimation for multiple response contingency tables - Ivy Liu, Victoria University
  Boardroom - Agriculture/Horticulture (Chair: Emlyn Williams)
    1100  Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactation - Roger Littlejohn, AgResearch
    1120  Some statistical approaches in estimating lambing rates - Mario D'Antuono, Dept of Agriculture WA
    1140  FTIR analysis associations with induction and release of kiwifruit buds from dormancy - Denny Meyer, Swinburne University of Technology
    1200  Non-linear mixed-effects modelling for a soil temperature study - Pauline Ding, ANU

1220  Lunch (1 hour 10 minutes)

1330  Invited Speaker (Swifts)
      Alison Smith, NSW Department of Industry and Investment
      Embedded partially replicated designs for grain quality testing (Chair: David Baird)

1410-1510  Session 3
  Swifts - Design (Chair: Ross Darnell)
    1410  Spatial models for plant breeding trials - Emlyn Williams, ANU
    1430  A two-phase design for a high-throughput proteomics experiment - Kevin Chang, University of Auckland
    1450  Shrinking sea-urchins in a high CO2 world: a two-phase experimental design - Kathy Ruggiero, University of Auckland
  Boardroom - Functional Analysis (Chair: Marcus Davy)
    1410  Can functional data analysis be used to develop a new measure of global cardiac function? - Katrina Poppe, University of Auckland
    1430  Variable penalty dynamic warping for aligning GC-MS data - David Clifford, CSIRO
    1450  A model for the enzymatically 18O-labeled MALDI-TOF mass spectra - Tomasz Burzykowski, Hasselt University

1510  Afternoon Tea (30 minutes)

1540-1700  Session 4
  Swifts - Methods (Chair: David Clifford)
    1540  High-dimensional multiple hypothesis testing with dependence - Sandy Clarke, University of Melbourne
    1600  Metropolis-Hastings algorithms with adaptive proposals - Renate Meyer, University of Auckland
    1620  Bayesian inference for multinomial probabilities with non-unique cell classification and sparse data - Nokuthaba Sibanda, Victoria University
    1640  Filtering in high dimension dynamic systems using copulas - Jonathon Briggs, University of Auckland
  Boardroom - Mixtures & Classification (Chair: Thomas Yee)
    1540  On estimation of nonsingular normal mixture densities - Michael Stewart, University of Sydney
    1600  Estimation of finite mixtures with nonparametric components - Chew-Seng Chee, University of Auckland
    1620  Classification techniques for class imbalance data - Siva Ganesh, Massey University
    1640  Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the mean - Selvanayagam Ganesalingam, Massey University

1800  Conference Dinner

THURSDAY 3RD DEC
900   Keynote Address (Swifts)
      Chris Triggs, University of Auckland
      Nutrigenomics - a source of new statistical challenges (Chair: Ruth Butler)

950-1030  Session 1
  Swifts - Genetics (Chair: Ken Dodds)
    950   Combination of clinical and genetic markers to improve cancer prognosis - Kim-Anh Le Cao, University of Queensland
    1010  Effective population size estimation using linkage disequilibrium and diffusion approximation - Jing Liu, University of Auckland
  Boardroom - Ecology (Chair: Duncan Hedderley)
    950   A multivariate feast among bandicoots at Heirisson Prong - Teresa Neeman, ANU
    1010  Environmental impact assessments: a statistical encounter - Dave Saville, Saville Statistical Consulting Ltd

1030  Morning Tea (30 minutes)

1100  Invited Speaker (Swifts)
      Kaye Basford, University of Queensland
      Ordination of marker-trait association profiles from long-term international wheat trials (Chair: Lyn Hunt)

1140-1220  Session 2
  Swifts - Medical (Chair: Ken Beath)
    1140  Finding best linear combination of markers for a medical diagnostic with restricted false positive rate - Yuan-chin Chang, Academia Sinica
    1200  A modified combination test for the analysis of clinical trials - Markus Neuhäuser, Rhein Ahr Campus
  Boardroom - Genetics (Chair: Julian Taylor)
    1140  Believing in magic: validation of a novel experimental breeding design - Emma Huang, CSIRO Mathematics Informatics and Statistics
    1200  Phenotypes for training and validation of whole genome selection methods - Ken Dodds, AgResearch

1220  Closing Remarks
1230  Lunch
1300  Conference Concludes


ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts)
Louise Ryan, CSIRO Mathematics Informatics and Statistics
Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise Ryan
CSIRO Mathematics Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. in the insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.


950 - 1030

MONDAY 30TH NOV
Session 1 (Swifts): Medical
Chair: John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal cancer (CRC) is one of the most common malignant cancers in the world, and its burden varies because risk factors have different effects in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluating the risk factors of the cancer as a whole would not provide a thorough understanding of the disease. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis according to the pathology reports of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007 were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis in the Stata statistical software. The results confirm gender, alcohol history, IBD and tumor grade as specific risk factors of colon cancer, and hypertension, opium and personal history as specific risk factors of rectum cancer. BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity: colon and rectum cancers should be evaluated separately to reveal hidden associations which may not be revealed under general modeling. These findings could provide more information for prognosis and treatment therapy, and for the possible application of screening programs specifically for colon and rectum carcinomas.


PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia, Mathematics Informatics and Statistics, Glen Osmond, South Australia
2Department of Surgery, University of Adelaide, The Queen Elizabeth Hospital, Adelaide, South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001, and whose outcomes were followed for more than five years.

The ERA model is available at www.health.adelaide.edu.au/surgery/evar. The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using data from a specialist UK vascular institute. Despite UK patients being sicker (p<0.001), having larger aneurysms (p<0.001) and being more likely to die (p<0.05) than the Australian patients, the ERA model fitted the UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher areas under ROC curves and/or higher R2.

The ERA Model appears to be robust Further external validation and improvements to the model will occur within a recently approved NHMRC grant

1. Barnes (2008) Eur J Vasc Endovasc Surg 35: 571-579.
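For readers wanting to see the shape of such a model fit, the following is a minimal sketch in R (on simulated stand-in data, not the ERA audit data) of a stepwise forward binomial regression with a logit link of the kind described above; the variable names and coefficients are illustrative assumptions only.

# Minimal sketch: stepwise forward logistic regression on simulated data
set.seed(1)
n <- 500
dat <- data.frame(age = rnorm(n, 74, 7),              # hypothetical preoperative variables
                  asa = sample(1:4, n, replace = TRUE),
                  diameter = rnorm(n, 58, 10),
                  creatinine = rnorm(n, 0.11, 0.03))
dat$early_death <- rbinom(n, 1, plogis(-8 + 0.08 * dat$age + 0.4 * dat$asa))
null_fit <- glm(early_death ~ 1, family = binomial(link = "logit"), data = dat)
fwd <- step(null_fit, direction = "forward",
            scope = list(lower = ~ 1, upper = ~ age + asa + diameter + creatinine))
summary(fwd)   # preoperative variables selected for this endpoint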


950 - 1030

MONDAY 30TH NOV
Session 1 (Boardroom): Ecological Modelling
Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics Informatics and Statistics
2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that annual multi-species fishery-independent surveys be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region as well as for the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
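As an illustration only (the authors fitted Bayesian P-splines in BayesX), the same idea of a penalised spatial smooth of trawl density can be sketched in R with the mgcv package; the coordinates and densities below are simulated stand-ins.

library(mgcv)
set.seed(2)
trawls <- data.frame(lon = runif(300, 135, 142), lat = runif(300, -17, -12))
trawls$density <- exp(1 + 0.3 * sin(trawls$lon) + 0.2 * trawls$lat + rnorm(300, 0, 0.3))
fit <- gam(log(density) ~ s(lon, lat, k = 50), data = trawls)    # penalised spatial smooth
grid <- expand.grid(lon = seq(135, 142, 0.25), lat = seq(-17, -12, 0.25))
grid$pred <- exp(predict(fit, newdata = grid))                   # density surface for mapping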


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics Informatics and Statistics, Australia
2School of Mathematics and Statistics, Northeast Normal University, China

E-mail you-ganwangcsiroau

We investigate the rank regression for environmental data analysis Rank regression is robust and has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions To alleviate computational burden we apply the induced smoothing method which provides both regression parameter estimates and their covariance matrices after a few iterations Datasets from a few environmental studies will be analyzed for illustration
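A minimal illustration of rank regression (not the authors' induced-smoothing estimator) is to minimise Jaeckel's rank-based dispersion directly; the simulated heavy-tailed data below are assumptions for demonstration only.

set.seed(3)
x <- rnorm(100); y <- 2 + 1.5 * x + rt(100, df = 2)    # heavy-tailed errors
jaeckel <- function(b, x, y) {                          # Wilcoxon-score dispersion function
  e <- y - b * x
  a <- sqrt(12) * (rank(e) / (length(e) + 1) - 0.5)
  sum(a * e)
}
slope <- optimize(jaeckel, c(-10, 10), x = x, y = y)$minimum
intercept <- median(y - slope * x)                      # location estimated from the residuals
c(intercept = intercept, slope = slope)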


1100 -1220

MONDAY 30TH NOV
Session 2 (Swifts): Modelling
Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird1
1VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize Σ e(Q − I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
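As a brief sketch of the method, the quantreg package in R (one of the implementations mentioned above) fits several quantiles at once; the heteroscedastic data here are simulated for illustration only.

library(quantreg)
set.seed(4)
x <- runif(200, 0, 10)
y <- 2 + 0.5 * x + (0.2 + 0.1 * x) * rnorm(200)         # spread increases with x
fit <- rq(y ~ x, tau = c(0.1, 0.5, 0.9))                # 10th, 50th and 90th percentiles
coef(fit)                                               # one column of coefficients per quantile
plot(x, y)
for (j in 1:3) abline(coef = coef(fit)[, j], lty = j)   # the three fitted quantile lines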


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS, University of Newcastle
2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed to be distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.
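For orientation, a classical random-effects pooling of study log risk ratios (DerSimonian-Laird, not the authors' Bayesian WinBUGS model) looks like the R sketch below; the study estimates and standard errors are hypothetical.

logrr <- log(c(1.8, 2.5, 1.4, 3.0, 2.2, 1.9))           # hypothetical study risk ratios
se    <- c(0.40, 0.35, 0.50, 0.45, 0.30, 0.38)
w     <- 1 / se^2
fixed <- sum(w * logrr) / sum(w)
Q     <- sum(w * (logrr - fixed)^2)
tau2  <- max(0, (Q - (length(logrr) - 1)) / (sum(w) - sum(w^2) / sum(w)))
w_re  <- 1 / (se^2 + tau2)                              # weights include between-study variance
pooled <- sum(w_re * logrr) / sum(w_re)
exp(pooled + c(est = 0, lower = -1.96, upper = 1.96) / sqrt(sum(w_re)))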


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie1
1Department of Statistics and School of Geography, Geology and Environment, University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are a result of a possible failure of the uniformitarianism principle. This is because the response of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and hence this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring width indices for each core was divided into two subsets:

1. The portion of the series produced when the trees were small, and

2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham1
1Analytics - SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers some users a way to experiment with new cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together and plans for future integration

1100 - 1220

MONDAY 30TH NOV
Session 2 (Boardroom): Environmental & Methods
Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ
2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term To demonstrate the method we analyse a reliability testing data set


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit, ANU
2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals, but as yet has not been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaConTM (Vac1), or a single vaccination of GonaConTM followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions at irregular intervals over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or boosted vaccination.

The data are analysed using repeated measures methods to assess the long-term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics Informatics and Statistics
2CSIRO Wealth from Oceans Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and to multispecies management.
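A small sketch of the finite-mixture-of-GLMs idea, using the flexmix package in R rather than the authors' own code, and simulated presence/absence data on a single temperature gradient, might look like this; the two latent groups and their coefficients are assumptions for illustration.

library(flexmix)
set.seed(6)
dat <- data.frame(temp = runif(400, 20, 30))
grp <- sample(1:2, 400, replace = TRUE)                 # two latent "archetypes"
p   <- ifelse(grp == 1, plogis(-12 + 0.5 * dat$temp), plogis(8 - 0.4 * dat$temp))
dat$pres <- rbinom(400, 1, p)
fit <- flexmix(cbind(pres, 1 - pres) ~ temp, data = dat, k = 2,
               model = FLXMRglm(family = "binomial"))
parameters(fit)   # archetype-specific intercepts and slopes along the gradient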


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark1
1University of Otago

E-mail aclarkmathsotagoacnz

When the chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser S & Greenhouse S W (1958, JEBS, 69-82) and Huynh H & Feldt L S (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.
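As a baseline only, the ordinary chi-square test on a hypothetical 2 x 12 symptom table is sketched below in R; the talk's contribution is to adjust the statistic and degrees of freedom (in the spirit of the Geisser-Greenhouse and Huynh-Feldt corrections) when the symptom columns are dependent, which plain chisq.test() does not do.

set.seed(7)
symptoms <- matrix(rpois(24, 30), nrow = 2,
                   dimnames = list(c("H1N1", "seasonal"), paste0("S", 1:12)))
plain <- chisq.test(symptoms)
plain$statistic   # test statistic assuming independent cells
plain$parameter   # degrees of freedom assuming independent cells
# A dependence correction would scale the degrees of freedom (and the reference
# distribution) by a factor estimated from the correlation among the symptoms.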

1330 MONDAY 30TH NOV
Invited Speaker (Swifts): Ross Ihaka, University of Auckland
Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ
2University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R, and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.


1410 - 1510

MONDAY 30TH NOV
Session 3 (Swifts): Variance
Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1
1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, to approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modelling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods was illustrated with examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlEx.

Using fungus microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages in which ANOVA and REML are standard methods may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
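The hierarchical partition the talk describes can be sketched with a standard REML fit in R's lme4 package (on simulated data, not the fungus microsatellite data); the fitted variance components correspond to AMOVA's among-region, among-population and within-population tiers.

library(lme4)
set.seed(8)
dat <- expand.grid(ind = 1:10, pop = factor(1:5), region = factor(1:4))
dat$pop <- interaction(dat$region, dat$pop)             # populations nested within regions
region_eff <- rnorm(nlevels(dat$region), 0, 0.5)
pop_eff    <- rnorm(nlevels(dat$pop), 0, 0.8)
dat$y <- region_eff[dat$region] + pop_eff[dat$pop] + rnorm(nrow(dat))
fit <- lmer(y ~ 1 + (1 | region) + (1 | pop), data = dat, REML = TRUE)
VarCorr(fit)   # among-region, among-population and residual (within) components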


1410 - 1510

MONDAY 30TH NOV
Session 3 (Boardroom): Genetics
Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules, using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ-funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla12

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation such as spatial trends and extraneous environmental variation need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high dimensional genetic component becomes problematic. This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way, even when the number of genetic variables exceeds the number of observations. The method is then applied to wheat quality traits and a well-established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, together with associated phenotypic data (milk yield; protein, casein and total solids percentage and yield; and growth hormone, IGF and insulin levels). These data are highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data were analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data were used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.
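The two analysis steps described above (a genome-wide correlation screen, then a mixed model with sire as a random term) can be sketched in R as follows; the data here are simulated stand-ins, not the LIC dataset, and the gene names are illustrative.

library(lme4)
set.seed(9)
expr <- matrix(rnorm(250 * 1000), 250, 1000, dimnames = list(NULL, paste0("gene", 1:1000)))
sire <- factor(sample(1:25, 250, replace = TRUE))
milk <- rnorm(250) + 0.4 * expr[, "gene1"]              # gene1 truly associated
pvals <- apply(expr, 2, function(g) cor.test(g, milk)$p.value)   # correlation screen
top <- names(sort(pvals))[1:5]
d <- data.frame(milk = milk, x = expr[, top[1]], sire = sire)
fit <- lmer(milk ~ x + (1 | sire), data = d)            # follow-up with sire as random effect
summary(fit)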


1540 - 1700

MONDAY 30TH NOV
Session 4 (Swifts): Modelling
Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined. Some are highly statistical; some are based much more on clinical judgement; some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment, nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, the convenience of using the medication and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.

DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample, these texture analysis and rheological methods generated many data points, which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively, by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to identify points of interest more objectively and to process large numbers of sample results quickly.
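
A small R sketch of the kind of rule-based curve summaries described (peaks, troughs, point of maximum slope); the vectors x and y holding one sample's curve are assumptions for illustration only.

summarise_curve <- function(x, y) {
  dy <- diff(y) / diff(x)                            # numerical first derivative
  peaks   <- which(diff(sign(diff(y))) == -2) + 1    # local maxima
  troughs <- which(diff(sign(diff(y))) ==  2) + 1    # local minima
  steepest <- which.max(abs(dy))                     # segment with maximum slope
  list(peak_x = x[peaks], trough_x = x[troughs],
       max_slope_x = mean(x[steepest + 0:1]), max_slope = dy[steepest])
}
# Applying the same rule to every sample makes the summaries objective and repeatable:
# results <- lapply(curves, function(d) summarise_curve(d$x, d$y))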

INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1, Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful, hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern, as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter into this model in various ways. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.

STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurements of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth differs between populations. Conclusion: The standard growth curves for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length
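
A brief R sketch of one common way to build such reference curves (a quadratic mean model plus normal-theory percentiles from the residual SD); this is an illustration under assumptions, not necessarily the authors' exact method, and the data frame fetal with columns ga (gestational age in weeks) and bpd is hypothetical.

fit  <- lm(bpd ~ ga + I(ga^2), data = fetal)      # quadratic model in gestational age
newd <- data.frame(ga = 20:42)
mu   <- predict(fit, newd)
s    <- summary(fit)$sigma                        # residual SD (could itself be modelled in ga)
pct  <- data.frame(ga  = newd$ga,
                   p10 = mu + qnorm(0.10) * s,
                   p50 = mu,
                   p90 = mu + qnorm(0.90) * s)
matplot(pct$ga, pct[, -1], type = "l", lty = c(2, 1, 2),
        xlab = "Gestational age (weeks)", ylab = "BPD")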

1540 - 1700

MONDAY 30TH NOV, Session 4 (Boardroom): Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1, Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock-wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled as a function of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood-based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.
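
A small R sketch of scoring a set of candidate presence/absence models by AIC, BIC and a generic GIC (which simply replaces the fixed penalty with a user-chosen one); the data frame scat, its binary response presence, the predictors food and predator, and the GIC penalty of 3 are all illustrative assumptions.

cands <- list(presence ~ food,
              presence ~ predator,
              presence ~ food + predator,
              presence ~ food * predator)
score <- function(f) {
  fit <- glm(f, family = binomial, data = scat)
  ll  <- as.numeric(logLik(fit))
  c(AIC = AIC(fit), BIC = BIC(fit),
    GIC = -2 * ll + 3 * length(coef(fit)))   # GIC with penalty lambda = 3, say
}
round(t(sapply(cands, score)), 2)            # one row of criteria per candidate model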

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1, Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.

A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1, Charis Burridge1 and Bill Senior2

1CSIRO Australia, Mathematics, Informatics and Statistics
2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, handling the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99: 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second-order model, central composite design
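
A toy R sketch of the backfitting idea for an additive second-order model: each term is fitted to the partial residuals of the others and the cycle is repeated until convergence. It assumes the design columns (e.g. x1, x2, x1^2, x2^2, x1*x2 from a central composite design) are centred and near-orthogonal; all names are illustrative.

backfit <- function(y, X, tol = 1e-8, maxit = 100) {
  # X: list of centred design-matrix columns, one per model term
  p <- length(X); beta <- numeric(p); mu <- mean(y)
  for (it in seq_len(maxit)) {
    beta.old <- beta
    for (j in seq_len(p)) {
      partial <- y - mu - Reduce(`+`, Map(`*`, X[-j], beta[-j]))  # remove the other terms
      beta[j] <- sum(X[[j]] * partial) / sum(X[[j]]^2)            # simple least-squares update
    }
    if (max(abs(beta - beta.old)) < tol) break
  }
  list(intercept = mu, beta = beta, iterations = it)
}

With orthogonal columns, as the central composite design provides, the updates barely interact and the loop converges almost immediately, which is the point made in the abstract.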

TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland1
1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.
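
One simple way to allow for clustering by treatment provider in an individually randomised trial is a mixed model with a random intercept per provider. This is a hedged illustration only, not necessarily the method suggested in the talk; the data frame trial with outcome y, treatment arm treat and operator identifier provider is hypothetical.

library(lme4)
fit  <- lmer(y ~ treat + (1 | provider), data = trial)  # provider as a random effect
summary(fit)
# Compare with the naive analysis that ignores the hidden sample of providers:
fit0 <- lm(y ~ treat, data = trial)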

950 - 1030

TUESDAY 1ST DEC, Session 1 (Swifts): Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-BioStat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of these have been used for a very long time, while others were developed more recently in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above-mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision, it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as the other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.

APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in the identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

950 - 1030

TUESDAY 1ST DEC, Session 1 (Boardroom): Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

I will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
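
A brief R sketch of the kind of model sequence described, moving from a plain Poisson fit to models that accommodate overdispersion and extra zeros; the data frame dat with count y and covariate x is hypothetical, and this is only an illustration of the general strategy, not the talk's specific example.

library(pscl)                                        # for zeroinfl()
f1 <- glm(y ~ x, family = poisson, data = dat)       # step 1: simple Poisson regression
f2 <- glm(y ~ x, family = quasipoisson, data = dat)  # allow for overdispersion
f3 <- zeroinfl(y ~ x | x, dist = "negbin", data = dat)  # extra zeros plus overdispersion
# Diagnostic: compare the observed number of zeros with the number the Poisson fit expects
c(observed = sum(dat$y == 0), poisson = sum(dpois(0, fitted(f1))))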

A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients, rather than the fitted means, to be non-negative. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.
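
A compact R sketch of the basic EM-type multiplicative update for an identity-link Poisson model with non-negative coefficients, i.e. the building block that the talk extends to unrestricted coefficients; the response y and non-negative design matrix X are hypothetical.

em_identity_pois <- function(y, X, iters = 500) {
  # X must have non-negative entries; start from positive coefficients
  beta <- rep(mean(y) / (ncol(X) * mean(X)), ncol(X))
  for (k in seq_len(iters)) {
    mu <- as.vector(X %*% beta)
    # EM update for the Poisson deconvolution problem: multiplicative,
    # so the coefficients automatically stay non-negative
    beta <- beta * as.vector(t(X) %*% (y / mu)) / colSums(X)
  }
  beta
}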

1130 - 1230

TUESDAY 1ST DEC, Session 2 (Swifts): Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1, Gillian Heller1, Ken Beath1, Val Gebski2, Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.

ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1, Andrea Rotnitzky2,3 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV-infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models, and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV-positive patients to illustrate estimation of the optimal CD4 count level at which to start HAART.

PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1, Andrew Forbes2, Adrienne Kirby3, Ian Marschner4, John Simes3, Malcolm West5 and Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction (MI) events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk for those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.

1130 - 1230

TUESDAY 1ST DEC, Session 2 (Boardroom): Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1, Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney, F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney, F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities and is in that sense superior compared to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing similar parameter estimates after imputation to those found when data were fully observed. It will be shown that the amount of missingness present in the data set, and the nature of the variable in question, affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.
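
A minimal R sketch of the multiple-imputation-plus-logistic-regression pipeline that such stability assessments build on; the data frame preg, its binary outcome viable and the covariate names are hypothetical, and the model here is illustrative rather than the study's final model.

library(mice)
imp  <- mice(preg, m = 20, seed = 1)                       # 20 imputed data sets
fits <- with(imp, glm(viable ~ age + hcg + progesterone,   # fit the model in each
                      family = binomial))
pool(fits)                                                 # combine by Rubin's rules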

STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists use techniques that produce high-dimensional data to unveil hidden information. Although several model-based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple as it involves only repeatedly implementing single-term analysis of variance. The main idea is to stepwise pare down the total variation of the responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance, compared to several methods available in R packages including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.

EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e. the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1/p2 and r- = (1 - p1)/(1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio r_x = f1(x)/f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating r_x and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.

WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample for a "validation study" or a sample stratified on a health outcome for a "case-control study". It is now well established that stratifying the sampling on exposure variables, in addition to outcome, can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.
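
A heavily hedged R sketch of the survey-calibration idea for two-phase samples (one way of bringing whole-cohort information into a subsample analysis; the talk's own methods may differ). The cohort data frame, its subsample indicator insample, and the variable names are all assumptions.

library(survey)
des <- twophase(id = list(~1, ~1), strata = list(NULL, ~case),
                subset = ~insample, data = cohort)
# Calibrate the phase-2 weights to whole-cohort totals of auxiliary variables
cal <- calibrate(des, phase = 2, formula = ~ age + sex + case)
svyglm(case ~ exposure + age + sex, design = cal, family = quasibinomial())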

950 - 1030

WEDNESDAY 2ND DEC, Session 1 (Swifts): Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1, Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al (2002, Biometrical J 44: 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient, methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al 1997, Ann Internal Medicine 127: 596-603), the same setting that was used by Arbogast et al for their simulations.

CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1, Raymond Carroll2, Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute
2Texas A and M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.

950 - 1030

WEDNESDAY 2ND DEC, Session 1 (Boardroom): Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model is proposed, reflecting the commonly used sampling process in marine surveys, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling is quite influential on presence/absence measures of species, and cannot be ignored.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.

1100 - 1220

WEDNESDAY 2ND DEC, Session 2 (Swifts): Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1, Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159: 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that, when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
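
A short R sketch of one of the compared approaches, the modified Poisson method of Zou (2004): a log-link Poisson fit for the binary outcome with a robust (sandwich) variance. The data frame rct with binary outcome y, treatment indicator treat and baseline covariate x is hypothetical.

library(sandwich); library(lmtest)
fit <- glm(y ~ treat + x, family = poisson(link = "log"), data = rct)
coeftest(fit, vcov = vcovHC(fit, type = "HC0"))   # robust standard errors
exp(coef(fit)["treat"])                           # adjusted relative risk for treatment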

MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail espark02gmailcom

The idea of a response-adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial without too much loss of statistical significance and efficiency. In addition, innovation in genomics-related biomedical research is making personalized medicine possible, which also makes adjustment for the covariates of the subjects who join the trial an important issue in clinical trials.

Adaptive design is a longstanding statistical method used when the design for a statistical model involves unknown parameters that must be estimated during the course of the experiment; thus the concept of sequential analysis is naturally involved. The large sample properties of estimation under such a problem have been studied and can be found in the literature, for example Zhang et al (2007, Annals of Statistics 35: 1166-82). However, the fully sequential setup, which requires both the estimation and design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure - the multiple-stage method - which requires the estimation and design to be updated at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it maintains the advantage of the fully sequential method to some level and is more convenient in practical operation. Here we study the three-stage procedure based on a logistic regression model, which is very popular in evaluating treatment effects when binary responses are observed. A numerical study of synthesized data is also presented.

Traditionally, we use a response-adaptive (RA) design by assuming there is no treatment-covariate interaction effect, where the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the method with the RA design will make incorrect treatment allocations; that is, it can be correct in only one part of the population but completely wrong in the other. Thus, in this case, the CARA design should perform better than the RA design.

In this work we also compare sequential analysis in response-adaptive designs with and without covariate adjustment, and a numerical study of synthesized data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail patrickgrahamotagoacnz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple-category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al (2005, Health Services Research 40: 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes, and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30-day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.

LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail iliumsorvuwacnz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, many surveys have a situation in which respondents may select more than one outcome category, so the observations can fall in more than one category in the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.

1100 - 1220

WEDNESDAY 2ND DEC, Session 2 (Boardroom): Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail rogerlittlejohnagresearchconz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail mdantuonoagricwagovau

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the seeming lack of standard errors in many research papers in animal science in Australia and New Zealand.

FTIR ANALYSIS: ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1, Murray Judd2, John Meekings3, Annette Richardson3 and Eric Walton4

1Swinburne University of Technology
2Seeka Kiwifruit Industries
3The New Zealand Institute for Plant and Food Research Ltd
4University of Otago

E-mail dmeyerswineduau

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on the function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.

NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail Paulinedinganueduau

There is growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were Ground Cover Type (covered, uncovered), Distance from the log (0cm, 10cm, 20cm, 40cm, 80cm) and Depth (1cm, 5cm). Two non-linear mixed models were used to study the different treatment effects.

1330 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1
1Wagga Wagga Agricultural Institute, Australia

2Rothamsted Research Harpenden UK

E-mail alisonsmithindustrynswgovau

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples, which precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain-quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.

1410 - 1510

WEDNESDAY 2ND DEC, Session 3 (Swifts): Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetes-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second-phase, laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.

SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat-shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
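
A hedged R sketch of a standard limma analysis for two-colour arrays of this kind (not necessarily the exact design or contrasts used in the talk); the RG object read from scanner output, the targets frame describing which sample went on each channel, and the coefficient name "acidified" are assumptions.

library(limma)
MA     <- normalizeWithinArrays(RG)                 # within-array normalisation of log-ratios
design <- modelMatrix(targets, ref = "control")     # design from the two-colour target layout
fit <- lmFit(MA, design)
fit <- eBayes(fit)                                  # empirical Bayes moderated t-statistics
topTable(fit, coef = "acidified", number = 20)      # top genes for the acidification contrast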

1410 - 1510

WEDNESDAY 2ND DEC, Session 3 (Boardroom): Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1, Gillian Whalley1, Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately from relaxation.

Three-dimensional ultrasound images of the heart allow the volume of the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against the first and second derivatives traces a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation, and so develop a new measure of global cardiac function.
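
A schematic R sketch of the volume-derivative loop idea using an interpolating spline and its derivatives (a smoothing basis, as in a full functional data analysis, could be used instead); the vectors t (frame times) and vol (LV volumes) for one cardiac cycle are hypothetical.

sf <- splinefun(t, vol)                    # spline fit of volume over the cycle
tt <- seq(min(t), max(t), length = 200)
v  <- sf(tt); v1 <- sf(tt, deriv = 1); v2 <- sf(tt, deriv = 2)
# The curve (v, v1, v2) traces a closed loop in three dimensions; comparing its projected
# area during contraction and relaxation gives the proposed global measure.
pairs(cbind(volume = v, dV = v1, d2V = v2))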

VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high-dimensional and large, as the technology separates the substance into its constituent compounds and quantifies the amount of each. Typically, the first step in an analysis of data like this is alignment of the data, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances - e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al (2009) Anal Chem 81(3), 1000-1007.
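
A compact R sketch of the core idea: dynamic time warping in which a penalty is added whenever a non-diagonal step is taken. In the published method the penalty can vary along the signal; here it is held constant for brevity, and the signals x and y are hypothetical.

dtw_penalised <- function(x, y, penalty = 0) {
  n <- length(x); m <- length(y)
  D <- matrix(Inf, n + 1, m + 1); D[1, 1] <- 0
  for (i in 1:n) for (j in 1:m) {
    cost <- (x[i] - y[j])^2
    D[i + 1, j + 1] <- min(D[i, j] + cost,                # diagonal step, unpenalised
                           D[i, j + 1] + cost + penalty,  # non-diagonal step, penalised
                           D[i + 1, j] + cost + penalty)  # non-diagonal step, penalised
  }
  D[n + 1, m + 1]   # total alignment cost
}
# Larger penalties discourage warping, so signals are shifted locally rather than distorted:
# dtw_penalised(x, y, penalty = 0.1 * var(x))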

A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1, Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of the between-spectra variability on the results of peptide quantification the 18O-labeling approach can be considered The idea is similar to eg two-channel cDNA microarrays Peptides from two biological samples are analyzed in the same spectrum To distinguish between the two samples peptides from one sample are labeled with a stable isotope of oxygen 18O As a result a mass shift of 4 Da of the peaks corresponding to the isotopic distributions of peptides from the labeled sample is induced which allows to distinguish them from the peaks of the peptides from the unlabeled sample and consequently to quantify the relative abundance of the peptides

However due to the presence of small quantities of 16O and 17O atoms during the labeling step the labeled peptide may get various oxygen isotopes As a result not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum This incomplete labeling may result in the biased estimation of the relative abundance of the peptide

To address this issue Valkenborg et al submitted developed a Markov model which allows one to adjust the analysis of the 18O -labeled spectra for incomplete labeling The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance This assumption is most likely too restrictive from a practical point of view

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation of the model. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


WEDNESDAY 2ND DEC, 15:40 - 17:00, Session 4 (Swifts): Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1; 1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians from a variety of applications grows. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although this is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances when this is not the case, which will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the family-wise error rate (FWER) or FDR should enable us to improve the power of our procedures to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.
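For orientation only, the sketch below (an assumed set-up, not the speaker's) applies the standard Benjamini-Hochberg FDR adjustment, available in base R as p.adjust, to p-values from test statistics with a simple built-in linear (moving-average) dependence.

# Sketch: Benjamini-Hochberg FDR adjustment applied to p-values from
# correlated test statistics generated by a moving-average (linear) process.
set.seed(42)
m <- 1000                                  # number of tests
z <- rnorm(m + 1)
stat <- (z[-1] + z[-(m + 1)]) / sqrt(2)    # MA(1)-type dependence between neighbours
stat[1:50] <- stat[1:50] + 3               # 50 true effects
p <- 2 * pnorm(-abs(stat))                 # two-sided p-values
p_bh <- p.adjust(p, method = "BH")         # FDR-controlling adjustment
sum(p_bh < 0.05)                           # number of discoveries at FDR 5%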


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand; 2University of South Carolina, USA

3University of Montreal Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities. Using various examples we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.
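The algorithms in the talk build adaptive proposals from mixtures of triangular and trapezoidal densities; as a much simpler stand-in, the sketch below shows a generic adaptive random-walk Metropolis sampler whose proposal scale is tuned from the running acceptance rate (all names and tuning values are hypothetical, not the authors' method).

# Generic adaptive random-walk Metropolis sketch (NOT the triangular/trapezoidal
# proposals of the talk): the proposal standard deviation is adapted towards a
# target acceptance rate during an initial phase, then held fixed.
adapt_mh <- function(logpost, init, n = 5000, sd0 = 1, target = 0.44) {
  out <- numeric(n); x <- init; s <- sd0; acc <- 0
  for (i in 1:n) {
    prop <- rnorm(1, x, s)
    if (log(runif(1)) < logpost(prop) - logpost(x)) { x <- prop; acc <- acc + 1 }
    if (i %% 100 == 0 && i <= n / 2) {           # adapt only in the first half
      rate <- acc / i
      s <- s * exp(rate - target)                # enlarge scale if accepting too often
    }
    out[i] <- x
  }
  out
}

# Example: sample from a non-log-concave target (mixture of two normals)
logpost <- function(x) log(0.3 * dnorm(x, -2, 0.5) + 0.7 * dnorm(x, 2, 1))
draws <- adapt_mh(logpost, init = 0)
mean(draws[-(1:2500)])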


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1; 1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) is used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.
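To make the augmented-data idea concrete, here is a toy sketch (assumed details, not the author's model): counts are observed for three cells, plus some observations known only to lie in cell 1 or cell 2, and the ambiguous memberships are imputed within a Gibbs sampler under a Dirichlet prior.

# Toy augmented-data Gibbs sampler for multinomial probabilities when some
# observations cannot be uniquely classified (ambiguous between cells 1 and 2).
set.seed(1)
counts <- c(12, 5, 3)     # uniquely classified counts for cells 1..3
n_amb  <- 10              # observations known only to be in cell 1 or cell 2
alpha  <- c(1, 1, 1)      # Dirichlet prior parameters
n_iter <- 2000
theta  <- matrix(NA, n_iter, 3)
th <- rep(1/3, 3)
for (it in 1:n_iter) {
  # impute how many ambiguous observations fall in cell 1 versus cell 2
  x1 <- rbinom(1, n_amb, th[1] / (th[1] + th[2]))
  full <- counts + c(x1, n_amb - x1, 0)
  # conjugate Dirichlet update given the completed counts
  g  <- rgamma(3, shape = alpha + full)
  th <- g / sum(g)
  theta[it, ] <- th
}
colMeans(theta[-(1:500), ])   # posterior means of the cell probabilities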

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1; 1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate- or high-dimensional observations into high-dimensional spatiotemporal model estimates with a general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


WEDNESDAY 2ND DEC, 15:40 - 17:00, Session 4 (Boardroom): Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1; 1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation consists, in certain proportions, of a known number of subpopulations whose distributions belong to the same yet unknown family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show through simulation studies and some real-world biological data sets that the new approach performs better.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in statistics and data mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally sized, or balanced, and the classification techniques assume that misclassification errors have equal costs. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class/group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). Traditional classification techniques perform badly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being on building models that correctly classify the minority class.

In this presentation a brief overview of the approaches found in the literature is given, followed by details of some proposed alternatives. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class; a minimal sketch of this idea is given below. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples and the findings are discussed.
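The sketch below illustrates the sampling approach on a hypothetical two-class data set, over-sampling the minority class and under-sampling the majority class before any classifier is fitted; it is not the authors' procedure.

# Sketch: rebalance a two-class training set by over-sampling the minority class
# and under-sampling the majority class (illustrative only).
rebalance <- function(data, class_col) {
  cls <- data[[class_col]]
  tab <- table(cls)
  minority <- names(tab)[which.min(tab)]
  majority <- names(tab)[which.max(tab)]
  n_target <- round(mean(tab))                                          # common class size
  idx_min <- sample(which(cls == minority), n_target, replace = TRUE)   # over-sample
  idx_maj <- sample(which(cls == majority), n_target, replace = FALSE)  # under-sample
  data[c(idx_min, idx_maj), ]
}

set.seed(1)
d <- data.frame(x = rnorm(1000), y = factor(c(rep("maj", 950), rep("min", 50))))
table(rebalance(d, "y")$y)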


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of error rates is of vital importance in classification problems, as this is used as a basis to choose the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and the QDF are derived and computed for various covariance structures in a simulation exercise, which serves as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. This approximation also provides a closed form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the misclassification error computation in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data


THURSDAY 3RD DEC, 9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS ndash A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1; 1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory Bowel Diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


THURSDAY 3RD DEC, 9:50 - 10:30, Session 1 (Swifts): Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao1,2, Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia; 2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia; 3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France; 4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail klecaouqeduau

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade, ...). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap. 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1; 1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift can be used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation, and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.
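For background, one commonly quoted form of the linkage-disequilibrium estimator (stated here as an assumed illustration, not necessarily the exact estimator discussed in the talk) uses the approximation E[r2] ≈ 1/(3Ne) + 1/S for pairs of unlinked loci in a sample of S individuals, so that Ne is estimated from the mean squared correlation after removing the sampling contribution.

# Hedged sketch of a simple LD-based Ne estimator (one common approximation).
estimate_Ne <- function(r2, S) {
  r2_adj <- mean(r2) - 1 / S        # remove the sampling-size contribution
  1 / (3 * r2_adj)                  # invert E[r2] ~ 1/(3*Ne) + 1/S
}

# Example with simulated r2 values whose mean corresponds to Ne of about 100 when S = 50
set.seed(1)
r2 <- rbeta(2000, 1, 42)
estimate_Ne(r2, S = 50)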

THURSDAY 3RD DEC, 9:50 - 10:30, Session 1 (Boardroom): Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS A STATISTICAL ENCOUNTER

Dave Saville1; 1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting.


THURSDAY 3RD DEC, 11:00 Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1, IH Delacy1,2, J Crossa3, PM Kroonenberg4, MJ Dieters1 and KE Basford1,2; 1The University of Queensland, Australia; 2Australian Centre for Plant Functional Genomics, Australia; 3CIMMYT, Mexico; 4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM block and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


THURSDAY 3RD DEC, 11:40 - 12:20, Session 2 (Swifts): Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1; 1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC respectively). In some medical diagnostics the false positive rate needs to be confined within a specific range, which makes the pAUC a reasonable choice in such circumstances. We therefore emphasize the pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

l_p = (w_D S_D + w_D̄ S_D̄)^(-1) (m_D - m_D̄)

where m_D, S_D and m_D̄, S_D̄ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients w_D, w_D̄ ∈ R depend on the given specificity and are also functions of l_p. Thus the solution for l_p requires an iterative procedure. We apply it to the data set of Liu et al (2005, Stat in Med) and the numerical results show that our method outperforms that of Liu et al (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes the pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large sample properties of this method are derived. We then apply it to some real data sets, and the results are very promising, locating markers that are never found via AUC-based methods.
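For reference, the AUC-optimal combination of Su and Liu (1993), which the pAUC-optimal vector above generalises, can be computed directly under a binormal model; the sketch below uses simulated marker data, and the iterative pAUC solution itself is not shown.

# Sketch: Su & Liu (1993) AUC-optimal linear combination of markers under a
# binormal model; the pAUC-optimal vector replaces the equal weights on the two
# covariance matrices by specificity-dependent weights, found iteratively (not shown).
set.seed(1)
p <- 3; nD <- 100; nH <- 100
muD <- c(1, 0.5, 0.2)
XD <- matrix(rnorm(nD * p), nD, p) + matrix(muD, nD, p, byrow = TRUE)  # diseased group
XH <- matrix(rnorm(nH * p), nH, p)                                     # non-diseased group
SD <- cov(XD); SH <- cov(XH)
a  <- solve(SD + SH, colMeans(XD) - colMeans(XH))   # Su-Liu combination vector
score <- c(XD %*% a, XH %*% a)
group <- c(rep(1, nD), rep(0, nH))
mean(outer(score[group == 1], score[group == 0], ">"))  # empirical AUC of the combined score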


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1; 1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study: both phases are analysed at the end of the study. Therefore an asymmetric decision rule, as proposed by Bauer & Köhne (1994) for adaptive designs, is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1·p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994); for example, if α = 0.05 and α0 = 0.5 then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
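The decision rule above is simple to apply once the critical values are known; the sketch below takes α0, α1 and cα as inputs (the cα value shown is an arbitrary placeholder, not a derived constant) and includes Fisher's combination p-value for comparison.

# Sketch of the modified combination test decision rule described above.
# alpha0, alpha1 and c_alpha must be supplied; their derivation is not shown here.
modified_combination_test <- function(p1, p2, alpha0, alpha1, c_alpha) {
  pmax12 <- max(p1, p2)
  (pmax12 <= alpha1) || (pmax12 <= alpha0 && p1 * p2 <= c_alpha)
}

# Fisher's combination p-value for two independent p-values, for comparison
fisher_p <- function(p1, p2) pchisq(-2 * (log(p1) + log(p2)), df = 4, lower.tail = FALSE)

# alpha = 0.05, alpha0 = 0.5 and alpha1 = 0.1793 as in the abstract;
# c_alpha = 0.01 is an arbitrary placeholder value for illustration only.
modified_combination_test(0.03, 0.12, alpha0 = 0.5, alpha1 = 0.1793, c_alpha = 0.01)
fisher_p(0.03, 0.12)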


THURSDAY 3RD DEC, 11:40 - 12:20, Session 2 (Boardroom): Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2; 1CSIRO Mathematics, Informatics and Statistics; 2CSIRO Food Futures National Research Flagship; 3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design representing phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, because there are multiple founders and the intermediate generations are unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC population in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1 Benoit Auvray1 Peter Amer2 Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch Invermay Agricultural Centre, Mosgiel, New Zealand; 2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will be almost true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets, and appropriate phenotypes, for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN

Ruth Butler1; 1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p-values are often interpreted in a classical analysis as giving 1 minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p-value without other information; a Bayesian posterior distribution can directly give the required probability (Matthews, 2001). In this poster, Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1. Matthews (2001, J Stat Plan Inf 94, 43-58)


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1 Tony Swain2 Olena Kravchuk1 and Geoffry Fordyce2

1School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia; 2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in the nutritive value of North Queensland pasture affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialled Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<57) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms, and unequal variances for the repeated measures, with the heifer-by-day variance-covariance matrix modelled as the direct product of identity and diagonal matrices (Carr et al, Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The vaccine × weaner size × time interaction was only significant in 1992. The vaccine × time interaction was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccination, but the vaccine effect diminished as heifers aged. The nutrition × weaner size × time interaction was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study, with its complicated, unbalanced repeated measures design.


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1; 1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable: omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the trees, which are then used to predict the other half of the data. By examining the predictive ability of the several thousand trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients that could be used in spreadsheet calculations.
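In outline, and with entirely hypothetical data and variable names, a gbm fit and relative-influence ranking of the kind described above looks something like this (tuning values are illustrative only, not those used in the poster):

# Sketch of a boosted regression tree fit with the R package gbm.
library(gbm)
set.seed(1)
dat <- data.frame(count = rpois(200, 5),
                  temp = rnorm(200), rainfall = rnorm(200), elevation = rnorm(200))
fit <- gbm(count ~ temp + rainfall + elevation,
           data = dat,
           distribution = "poisson",     # insect counts
           n.trees = 2000,
           interaction.depth = 3,
           shrinkage = 0.01,
           bag.fraction = 0.5,
           train.fraction = 0.5)         # half the data used for fitting, as described above
best <- gbm.perf(fit, method = "test")   # number of trees minimising test-set deviance
summary(fit, n.trees = best)             # relative influence of each variable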


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land, Crop and Food Sciences, University of Queensland; 2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of sweet-potato starch particles were created under different regimes of hammer milling, in an experiment investigating the digestibility of sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions changed in an obvious but complex way with changes in the milling energy, and the average volumetric diameter alone was not an adequate summary of the distributions. It was thus necessary to construct a tailored algorithm for summarizing the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1 Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne; 2Remote Sensing Team, CSIRO Sustainable Ecosystems; 3Senior Lecturer, The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. They are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
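For orientation, ridge (alpha = 0) and LASSO (alpha = 1) fits with cross-validated penalty selection can be obtained with the glmnet package; the data below are simulated, collinear stand-ins for the MODIS change metrics, not the poster's data.

# Sketch: ridge and LASSO regression with cross-validated penalty selection.
library(glmnet)
set.seed(1)
n <- 100; p <- 50
x <- matrix(rnorm(n * p), n, p)
x[, 2] <- x[, 1] + rnorm(n, sd = 0.1)          # induce collinearity
y <- 2 * x[, 1] - 1.5 * x[, 3] + rnorm(n)
ridge <- cv.glmnet(x, y, alpha = 0)            # ridge regression
lasso <- cv.glmnet(x, y, alpha = 1)            # LASSO
c(ridge = min(ridge$cvm), lasso = min(lasso$cvm))   # cross-validated mean squared errors
coef(lasso, s = "lambda.min")[1:5, ]           # first few LASSO coefficients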


CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia; 2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens, if not hundreds, of thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and to explore the extent to which this might be a problem in applications. In particular, we compare the analysis of log-transformed data to full compositional data analysis.
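As a concrete point of comparison, a naïve log transform and Aitchison's centred log-ratio (clr) transform of a small hypothetical count composition can be contrasted as follows (the zero offset of 0.5 is one common ad hoc choice, not a recommendation from the poster):

# Sketch: naive log transform versus the centred log-ratio (clr) transform
# for compositional count data.
clr <- function(counts, offset = 0.5) {
  x <- counts + offset
  comp <- x / sum(x)                 # close the composition
  log(comp) - mean(log(comp))        # centred log-ratio
}

counts <- c(geneA = 120, geneB = 30, geneC = 0, geneD = 850)
log(counts + 0.5)                    # naive analysis on the log scale
clr(counts)                          # compositional analysis on the clr scale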


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual tool for validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative, we have employed a minimum-squares-type estimate of the parameters based on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data for all but one of the 83 species.
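In outline, a maximum likelihood gamma fit and the Probability-Probability plot used to check it can be produced as below, with simulated weights standing in for the NPF data (the minimum-squares alternative is not shown):

# Sketch: fit a gamma distribution by maximum likelihood and check it with a P-P plot.
library(MASS)
set.seed(1)
w <- rgamma(200, shape = 1.8, rate = 0.4)          # stand-in for species weights
fit <- fitdistr(w, "gamma")                         # ML estimates of shape and rate
theor <- pgamma(sort(w), shape = fit$estimate["shape"], rate = fit$estimate["rate"])
empir <- ppoints(length(w))                         # empirical probabilities
plot(theor, empir, xlab = "Fitted gamma probability",
     ylab = "Empirical probability", main = "P-P plot")
abline(0, 1, lty = 2)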


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1; 1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Complications of diabetes, such as kidney disease, cause patients considerable pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data on this tendency in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that used the files of patients with type 2 diabetes who attended the Isfahan Endocrine and Metabolism Research Center from 1997 to 2007. This information was collected longitudinally. We used linear mixed-effects models to analyse the data. Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information this study provides can be used to identify groups at high risk of renal dysfunction.

Key words: longitudinal study, mixed-effects models, creatinine, type 2 diabetes


THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS UNDERGOING ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in Biostatistics, Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia; 2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Most studies of postoperative complications for isolated CABG surgery are based on a single population and may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications of CABG surgery for an Australian population, because there is no model developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation (60%) set and a model validation (40%) set. The data in the creation set were used to develop the model and the validation set was then used to validate it. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC and the Hosmer-Lemeshow p-value respectively. Results: Among the 14533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3). The two postoperative complications were new renal failure (365) and stroke (138). The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L < 0.001). Conclusion: We have identified risk factors for two major postoperative complications of CABG surgery.


SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2; 1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics, Informatics and Statistics; 2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis in which physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data; instead, they are usually point predictions from spatial models built from auxiliary data sources. It is not clear what kind of effect the modelled covariates will have on the model, although simple approximations for simple models do give indications. We have performed simulation studies to investigate the nature of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.


COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno; 2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this article is to identify the dependency structure of gene variants which have an influence on septic states of paediatric patients.

The data set contains data on 580 pediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1, 2, 3]. To identify the role of different combinations of gene variants, and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. This way it was possible to create a 5-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patients group. The typical combinations of gene variants for the healthy group and for the septic patients group were then found. The results correspond nicely to the results published in [1, 2, 3] for individual genes, and make it possible to recognize the typical combinations of variants of six genes on which attention should be focused.

References: [1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal/permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol 33, pp 2158-2164. [2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. Cesko-Slovenska Pediatrie 59, pp 632-636, 2004. [3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukine-6 gene variants and the risk of sepsis development in children. Human Immunology, Elsevier Science Inc, ISSN 0198-8859, 2007, vol 68, pp 756-760.


IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD-SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1 NZ Institute for Plant and Food Research Ltd; 2 20 Westminster Rd, Auckland 1024; 3 BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application of Bacillus thuringiensis Berliner (Bt) with a tractor-mounted boom sprayer was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established the persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for the earlier work) was increased by 15-18 fold, to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion p of the moths remained was also estimated, viz. for linear dispersal

(1 + bc) exp(-bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
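A sketch of the two model-fitting steps described above, using made-up trap-catch data rather than the study's: a log-link Poisson GLM for mean catch against distance, then uniroot() to solve the displayed equation for the distance c containing a given proportion of the population.

# Sketch: fit the dispersal curve with a Poisson log-link GLM, then solve
# (1 + b*c) * exp(-b*c) - (1 - p) = 0 for c (made-up trap data).
dist  <- c(50, 100, 200, 300, 500, 750)        # distance from sprayed field (m)
catch <- c(40, 22, 9, 5, 2, 1)                 # marked moths per trap
fit <- glm(catch ~ dist, family = poisson(link = "log"))
b <- -coef(fit)["dist"]                        # decay rate of the dispersal curve
f <- function(cc, b, p) (1 + b * cc) * exp(-b * cc) - (1 - p)
c90 <- uniroot(f, interval = c(1, 5000), b = b, p = 0.90)$root
c90                                            # distance containing 90% of the population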


IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury; 2Imperial College London; 3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study all oxygen saturation measurements across different behavioural states were combined together. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M, Lee DS, Russell G, et al. (2008, Australian Statistical Conference)
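A generic sketch of the clustering step (not the authors' exact procedure, and with simulated data in place of the 17 babies): compute the maximum distance between each pair of empirical distribution functions, then apply hierarchical clustering and cut into two groups.

# Generic sketch: hierarchical clustering of babies based on the maximum distance
# between the empirical distribution functions (ECDFs) of a saturation measure.
set.seed(1)
babies <- lapply(1:17, function(i) rnorm(100, mean = 95 - (i > 12) * 3, sd = 1 + (i > 12)))
grid <- seq(85, 100, by = 0.1)
ecdf_mat <- sapply(babies, function(x) ecdf(x)(grid))   # ECDF of each baby on a grid
d <- dist(t(ecdf_mat), method = "maximum")              # sup distance between ECDFs
hc <- hclust(d, method = "average")
cutree(hc, k = 2)                                       # two groups: 'stable' vs 'unstable'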


Index of Presenting Authors

Arnold R 33
Asghari M 26
Baird D 30
Balinas V 46
Barnes M 27
Basford KE 84
Beath K 51
Bland JM 49
Briggs J 76
Burridge C 28
Burzykowski T 73
Butler R (poster) 89
Campain A 56
Chang K 70
Chang Y 85
Chee C 77
Clark ASS 36
Clarke S 74
Clifford D 72
Connolly P (poster) 91
Cui J 55
D'Antuono M 67
Darnell R (1) 35
Darnell R (2) 47
Davy M 40
Day S 43
Ding P 69
Dobbie M 48
Dodds K 88
Fewster R 37
Forrester R 34
Ganesalingam S 79
Ganesh S 78
Gatpatan JMC 48
Graham M 33
Graham P 65
Huang E 87
Hwang J 57
Ihaka R 36
Jones G 45
Kifley A 53
Kipnis V 61
Koolaard J (poster) 92
Kravchuk O (poster 1) 90
Kravchuk O (poster 2) 92
Lazaridis D 93
Le Cao K 81
Littlejohn R 67
Liu I 66
Liu J 82
Lumley T 59
Marschner I 52
Matthews D 58
McLachlan A 44
Meyer D 68
Meyer R 75
Mohebbi M 38
Mueller S 47
Muller W (poster) 94
Naka M (poster) 95
Neeman T 82
Neuhäuser M 86
Orellana L 54
Park E 64
Park Z 42
Pirie M 32
Poppe K 71
Rousta S (poster) 96
Ruggiero K 71
Ryan L 25
Sanagou M (poster) 97
Saville D 83
Scott A 60
Shimadzu H 62
Shimadzu H (poster) 98
Sibanda N 76
Smerek M (poster) 99
Smith AB 69
Stewart M 77
Stojanovski E 31
Taylor J 41
Thijs H 50
Triggs CM 80
Wallace AR (poster) 100
Wang Y 29
Welsh A 51
Williams E 70
Yee T 62
Yelland L 63
Yoon H 39
Zahari M (poster) 101


DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

D'Antuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau


Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde


Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz


Conference At A Glance

Conference dinner - Wednesday 2 Dec

To add some novelty to the conference program, the conference dinner will be held at the Prawn Park restaurant, home of Shawn the Prawn. At the Prawn Park, just 10 minutes' drive north of Taupo on the Waikato River (see map on page 8), you will be able to compete at prawn fishing or the Killer Prawn Hole-in-One Golf Challenge, take a guided tour of the nursery and hatchery, enjoy fun and interactive water features and a glass of bubbly in the geothermal footbath, as well as a sumptuous meal with breathtaking views at the riverside restaurant (on the patio, weather permitting). Drinks (wine and non-alcoholic) will be provided and all dietary requirements can be catered for.

Coaches have been arranged to transfer delegates to the Huka Prawn Farm from the Suncourt Hotel, leaving at 6 pm, with return trips at the conclusion of the event.

Conference Prizes - Donated by CSIRO Mathematical and Information Sciences

Prizes will be awarded for the best oral presentation and the best poster presentation by a young statistician, as judged by a panel. To be eligible for these awards the presenter must be a member of the IBS Australasian Region and be either a student (full-time or part-time), or a person who has graduated with a Bachelor's Degree (in a biometrical-related field) within the last five years, or a person awarded a Postgraduate Degree within the past year.


KEYNOTE SPEAKERS

Martin Bland, University of York

Martin Bland joined the University of York as Professor of Health Statistics in 2003. Before this he spent 27 years at St George's Hospital Medical School, University of London, following posts at St Thomas's Hospital Medical School and in industry with ICI. He has a BSc in mathematics, an MSc in statistics and a PhD in epidemiology. He is the author or co-author of An Introduction to Medical Statistics, now in its third edition, and Statistical Questions in Evidence-based Medicine, both Oxford University Press; 190+ refereed journal articles reporting public health and clinical research and on research methods; and, with Prof Doug Altman, the Statistics Notes series in the British Medical Journal. He is currently working on clinical trials in wound care, hazardous alcohol use, depression, irritable bowel syndrome and stroke prevention. His personal research interests are in the design and analysis of studies of clinical measurement and of cluster randomised clinical trials. His 1986 Lancet paper with Doug Altman on statistical methods for assessing agreement between two methods of clinical measurement has now been cited more than 13,000 times; it is the most cited paper ever to appear in the Lancet and has been reported to be the sixth most highly cited statistical paper ever.

Martin presented a two-day satellite course in Auckland on 25-26 November on Cluster Randomised Trials

Thomas Lumley, University of Washington

Thomas Lumley is an Associate Professor in the Biostatistics Department at the University of Washington in Seattle. Thomas has accrued an impressive body of work and awards in a comparatively short amount of time. Since completing his PhD in 1998, Thomas has published well over 100 peer-reviewed articles in the leading journals of statistics, biostatistics and the health sciences, on theory, methodology and application. In addition, he has given a substantial number of talks and workshops around the world. In 2008 Thomas was awarded the Gertrude Cox Award for contributions to statistical practice. Thomas is also a member of the R Core development team, and his expertise in the field of statistical computing is recognised worldwide.


Louise Ryan, CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health, Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics, Informatics and Statistics (CMIS). Dr Ryan has a distinguished career in biostatistics, having authored or co-authored over 200 papers in peer-reviewed journals. Louise is a fellow of the American Statistical Association and the International Statistics Institute, and is an elected member of the Institute of Medicine. She has served in a variety of professional capacities, including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society. She has served on advisory boards for several government agencies in the USA, including the National Toxicology Program and the Environmental Protection Agency, as well as several committees for the National Academy of Science. She retains an adjunct professorship at Harvard.

Chris Triggs, University of Auckland

Chris Triggs is a Professor and the current head of the Department of Statistics at the University of Auckland, New Zealand. He has been a respected statistician for 30 years, specializing in fields as diverse as experimental design and forensic science. Professor Triggs has published more than 90 papers in a wide variety of statistical fields. His research interests include experimental design, population genetics and the application of statistical methods in many fields of science, including forensic science and nutrigenomics. He has lectured extensively in many of these subjects in Australasia. Professor Triggs is an Associate Editor for Biometrics and is often called upon as a referee for many other journals.


INVITED SPEAKERS

Ross Ihaka, University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland. He is recognized as one of the originators of the R programming language. In 2008 he received the Royal Society of New Zealand's Pickering Medal for his work on R.

Kaye Basford, University of Queensland

Kaye Basford is Head of the School of Land, Crop and Food Sciences at the University of Queensland. Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments, in particular using a pattern analysis approach. Kaye is currently IBS Vice-President, in advance of her Presidential term 2010-11.

Alison Smith, NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit, where she works on and researches methodology for plant breeding multi-environment variety trials, plant quality trait experiments, micro-array data and outlier detection in linear mixed models.


GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events.

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions.

Conference Catering
Lunches and morning and afternoon teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16).

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk.

Transport has been arranged in coaches to transfer delegates to dinner from the Suncourt Hotel & Conference Centre, leaving at 6 pm, with return trips at the conclusion of the event.

Welcome Reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.


VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel
Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (0.2 km)
Turn left at Titiraupenga Street (31 m)
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street then onto Tongariro Street)
Continue onto Tongariro Street (1.1 km - go through one roundabout)
Continue onto SH 1 / SH 5 (1.0 km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past the Honey Hive to the end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo, (07) 378 8265, www.suncourt.co.nz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo


ORGANISED SOCIAL ACTIVITIES

Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region.

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member, non-member or student) attending the whole week. Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events.

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere, so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6 pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim, then a meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo, Tel (07) 378-7080).

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event.

Other Organised Social Activities - Tuesday 1 Dec

1. Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors, smell the coffee brewing as you board the Waikare II, take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings. A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina. Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park. The sights are amazing all year round. Afternoon tea is included as part of your charter, and tea or coffee are complimentary throughout the cruise. There are also full bar facilities.

Fishing for, and hopefully eating, rainbow or brown trout is included in the charter, although to meet licence requirements only four clients can be nominated to actually land the catch. Only 4 lines can be put out at a time on downriggers. If successful, any catch can be barbequed or sashimied and served and shared onboard - there is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation, where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting; the cost is $180 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front
Take: Swimwear, including a towel, if you want an invigorating deep-water swim off the launch. Don't forget to take your camera, as some of the scenery can only be seen from on the water
Cost: $70 per person, based on a three-hour scenic charter including fishing, with clay bird shooting extra at $180 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2. Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River, and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.

Leaving the gushing sounds of the mesmerizing Falls, you cut through leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river, and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick-up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak in the thermal streams on the kayak down or on the walk back.

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (a waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3. Jetboating, geothermal and nature - Orakei Korako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Korako, possibly the best thermal area in New Zealand.

In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's cave and some of the largest silica terraces in the world.

While the park visitors are on land, option 2 ($140) whisks swimmers away to the Squeeze. You will disembark the boat in knee-deep warm water. After manoeuvring your way through narrow crevasses, climbing boulders and wading through waist-deep warm water, you emerge in stunning native New Zealand bush. Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool.

Then the groups rejoin for the thrilling return trip, giving a total trip time of about three hours. This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience.

Time: Transport departs Suncourt Hotel at 1.30 pm, returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera, as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 per person for option 1, including park admission; $140 per person for option 2; both options including transport
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used


4. Art, Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop-off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly; terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: A cafe snack is not included, but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22


SPONSORS

The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland


AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax


VENUE FLOOR PLAN

1. Boardroom: for all Boardroom session presentations
2. Swifts: for keynote addresses, invited speaker talks and all Swifts sessions
3. Bathrooms/Toilets
4. 'Chill on Northcroft' Restaurant: all morning/afternoon teas and lunches will be provided here
5. Gullivers: computer room with two internet access desktops
6. Lems: registration desk location, and further desk space and power points for wireless internet access



CONFERENCE TIMETABLE

SUNDAY 29TH NOV
16.00  Conference Registration opens
18.00  Welcome Reception
       Dinner (own arrangement)

MONDAY 30TH NOV
8.50   Presidential Opening (Swifts) - Graham Hepworth, University of Melbourne
9.00   Keynote Address (Swifts) - Louise Ryan, CSIRO Mathematics, Informatics and Statistics
       Quantifying uncertainty in risk assessment. Chair: Graham Hepworth
9.50-10.30  Session 1
  Swifts - Medical (Chair: John Field)
    9.50   Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approach - Mohamad Asghari, Tarbiat Modares University
    10.10  Personalised medicine: endovascular aneurysm repair risk assessment model using preoperative variables - Mary Barnes, CSIRO Mathematics, Informatics and Statistics
  Boardroom - Ecological Modelling (Chair: Teresa Neeman)
    9.50   Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splines - Charis Burridge, CSIRO Mathematics, Informatics and Statistics
    10.10  Rank regression for analyzing environmental data - You-Gan Wang, CSIRO Mathematics, Informatics and Statistics
10.30  Morning Tea (30 minutes)
11.00-12.20  Session 2
  Swifts - Modelling (Chair: Andrew McLachlan)
    11.00  Introduction to quantile regression - David Baird, VSN NZ Ltd
    11.20  Incorporating study characteristics in the modelling of associations across studies - Elizabeth Stojanovski, University of Newcastle
    11.40  A comparison of matrices of time series with application in dendroclimatology - Maryann Pirie, University of Auckland
    12.00  How SAS and R integrate - Michael Graham, SAS Auckland
  Boardroom - Environmental & Methods (Chair: Zaneta Park)
    11.00  Capture recapture estimation using finite mixtures of arbitrary dimension - Richard Arnold, Victoria University
    11.20  The effect of a GnRH vaccine, GonaCon, on the growth of juvenile tammar wallabies - Robert Forrester, ANU
    11.40  Model based grouping of species across environmental gradients - Ross Darnell, CSIRO Mathematics, Informatics and Statistics
    12.00  The use of the chi-square test when observations are dependent - Austina Clark, University of Otago
12.20  Lunch (1 hour 10 minutes)
13.30  Invited Speaker (Swifts) - Ross Ihaka, University of Auckland
       Writing efficient programs in R and beyond. Chair: Renate Meyer
14.10-15.10  Session 3
  Swifts - Variance (Chair: Geoff Jones)
    14.10  Variance estimation for systematic designs in spatial surveys - Rachel Fewster, University of Auckland
    14.30  Variance components analysis for balanced and unbalanced data in reliability of gait measurement - Mohammadreza Mohebbi, Monash University
    14.50  Modernizing AMOVA using ANOVA - Hwan-Jin Yoon, ANU
  Boardroom - Genetics (Chair: John Koolaard)
    14.10  Developing modules in GenePattern for gene expression analysis - Marcus Davy, Plant and Food Research
    14.30  High dimensional QTL analysis within complex linear mixed models - Julian Taylor, CSIRO Mathematics, Informatics and Statistics
    14.50  Correlation of transcriptomic and phenotypic data in dairy cows - Zaneta Park, AgResearch
15.10  Afternoon Tea (30 minutes)
15.40-17.00  Session 4
  Swifts - Modelling (Chair: Mario D'Antuono)
    15.40  Non-inferiority margins in clinical trials - Simon Day, Roche Products Ltd
    16.00  Data processing using Excel with R - Andrew McLachlan, Plant and Food Research, Lincoln
    16.20  Investigating covariate effects on BDD infection with longitudinal data - Geoffrey Jones, Massey University
    16.40  Statistical modelling of intrauterine growth for Filipinos - Vincente Balinas, University of the Philippines Visayas
  Boardroom - Ecology (Chair: Rachel Fewster)
    15.40  Visualising model selection criteria for presence and absence data in ecology - Samuel Mueller, University of Sydney
    16.00  Estimating weights for constructing composite environmental indices - Ross Darnell, CSIRO Mathematics, Informatics and Statistics
    16.20  A spatial design for monitoring the health of a large-scale freshwater river system - Melissa Dobbie, CSIRO Mathematics, Informatics and Statistics
    16.40  Backfitting estimation of a response surface model - Jhoanne Marsh C Gatpatan, University of the Philippines Visayas
17.00  Poster Session (Chair: Melissa Dobbie)
18.00  Dinner (own arrangement)


TUESDAY 1ST DEC
9.00   Keynote Address (Swifts) - Martin Bland, University of York
       Clustering by treatment provider in randomised trials. Chair: Simon Day
9.50-10.30  Session 1
  Swifts - Missing Data (Chair: Vanessa Cave)
    9.50   The future of missing data - Herbert Thijs, Hasselt University
    10.10  Application of latent class with random effects models to longitudinal data - Ken Beath, Macquarie University
  Boardroom - Count Data (Chair: Hwan-Jin Yoon)
    9.50   A strategy for modelling count data which may have extra zeros - Alan Welsh, ANU
    10.10  A reliable constrained method for identity link Poisson regression - Ian Marschner, Macquarie University
10.30  Morning Tea / IBS Biennial General Meeting (60 minutes)
11.30-12.30  Session 2
  Swifts - Medical (Chair: Hans Hockey)
    11.30  Multivariate response models for global health-related quality of life - Annette Kifley, Macquarie University
    11.50  Estimation of optimal dynamic treatment regimes from longitudinal observational data - Liliana Orellana, Universidad de Buenos Aires
    12.10  Parametric conditional frailty models for recurrent cardiovascular events in the LIPID study - Jisheng Cui, Deakin University
  Boardroom - Modelling (Chair: Olena Kravchuk)
    11.30  Building a more stable predictive logistic regression model - Anna Campain, University of Sydney
    11.50  Stepwise paring down variation for identifying influential multifactor interactions - Jing-Shiang Hwang, Academia Sinica
    12.10  Empirical likelihood estimation of a diagnostic test likelihood ratio - David Matthews, University of Waterloo
12.30  Lunch (1 hour)
13.30  Organised Social Activities
18.00  Dinner (own arrangement)


WEDNESDAY 2ND DEC
9.00   Keynote Address (Swifts) - Thomas Lumley, University of Washington
       Using the whole cohort in analysis of subsampled data. Chair: Alan Welsh
9.50-10.30  Session 1
  Swifts - Clinical Trials (Chair: Ian Marschner)
    9.50   Adjusting for nonresponse in case-control studies - Alastair Scott, University of Auckland
    10.10  Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associations - Victor Kipnis, USA National Cancer Institute
  Boardroom - Fisheries (Chair: Charis Burridge)
    9.50   An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimation - Hideyasu Shimadzu, Geoscience Australia
    10.10  On the 2008 World Fly Fishing Championships - Thomas Yee, University of Auckland
10.30  Morning Tea (30 minutes)
11.00-12.20  Session 2
  Swifts - Medical Models (Chair: Katrina Poppe)
    11.00  Relative risk estimation in randomised controlled trials: a comparison of methods for independent observations - Lisa Yelland, University of Adelaide
    11.20  Multiple stage procedures in covariate-adjusted response-adaptive designs - Eunsik Park, Chonnam National University
    11.40  Potential outcomes and propensity score methods for hospital performance comparisons - Patrick Graham, University of Otago
    12.00  Local odds ratio estimation for multiple response contingency tables - Ivy Liu, Victoria University
  Boardroom - Agriculture/Horticulture (Chair: Emlyn Williams)
    11.00  Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactation - Roger Littlejohn, AgResearch
    11.20  Some statistical approaches in estimating lambing rates - Mario D'Antuono, Dept of Agriculture WA
    11.40  FTIR analysis: associations with induction and release of kiwifruit buds from dormancy - Denny Meyer, Swinburne University of Technology
    12.00  Non-linear mixed-effects modelling for a soil temperature study - Pauline Ding, ANU
12.20  Lunch (1 hour 10 minutes)
13.30  Invited Speaker (Swifts) - Alison Smith, NSW Department of Industry and Investment
       Embedded partially replicated designs for grain quality testing. Chair: David Baird
14.10-15.10  Session 3
  Swifts - Design (Chair: Ross Darnell)
    14.10  Spatial models for plant breeding trials - Emlyn Williams, ANU
    14.30  A two-phase design for a high-throughput proteomics experiment - Kevin Chang, University of Auckland
    14.50  Shrinking sea-urchins in a high CO2 world: a two-phase experimental design - Kathy Ruggiero, University of Auckland
  Boardroom - Functional Analysis (Chair: Marcus Davy)
    14.10  Can functional data analysis be used to develop a new measure of global cardiac function? - Katrina Poppe, University of Auckland
    14.30  Variable penalty dynamic warping for aligning GC-MS data - David Clifford, CSIRO
    14.50  A model for the enzymatically 18O-labeled MALDI-TOF mass spectra - Tomasz Burzykowski, Hasselt University
15.10  Afternoon Tea (30 minutes)
15.40-17.00  Session 4
  Swifts - Methods (Chair: David Clifford)
    15.40  High-dimensional multiple hypothesis testing with dependence - Sandy Clarke, University of Melbourne
    16.00  Metropolis-Hastings algorithms with adaptive proposals - Renate Meyer, University of Auckland
    16.20  Bayesian inference for multinomial probabilities with non-unique cell classification and sparse data - Nokuthaba Sibanda, Victoria University
    16.40  Filtering in high dimension dynamic systems using copulas - Jonathon Briggs, University of Auckland
  Boardroom - Mixtures & Classification (Chair: Thomas Yee)
    15.40  On estimation of nonsingular normal mixture densities - Michael Stewart, University of Sydney
    16.00  Estimation of finite mixtures with nonparametric components - Chew-Seng Chee, University of Auckland
    16.20  Classification techniques for class imbalance data - Siva Ganesh, Massey University
    16.40  Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the mean - Selvanayagam Ganesalingam, Massey University
18.00  Conference Dinner


THURSDAY 3RD DEC
9.00   Keynote Address (Swifts) - Chris Triggs, University of Auckland
       Nutrigenomics - a source of new statistical challenges. Chair: Ruth Butler
9.50-10.30  Session 1
  Swifts - Genetics (Chair: Ken Dodds)
    9.50   Combination of clinical and genetic markers to improve cancer prognosis - Kim-Anh Le Cao, University of Queensland
    10.10  Effective population size estimation using linkage disequilibrium and diffusion approximation - Jing Liu, University of Auckland
  Boardroom - Ecology (Chair: Duncan Hedderley)
    9.50   A multivariate feast among bandicoots at Heirisson Prong - Teresa Neeman, ANU
    10.10  Environmental impact assessments: a statistical encounter - Dave Saville, Saville Statistical Consulting Ltd
10.30  Morning Tea (30 minutes)
11.00  Invited Speaker (Swifts) - Kaye Basford, University of Queensland
       Ordination of marker-trait association profiles from long-term international wheat trials. Chair: Lyn Hunt
11.40-12.20  Session 2
  Swifts - Medical (Chair: Ken Beath)
    11.40  Finding best linear combination of markers for a medical diagnostic with restricted false positive rate - Yuan-chin Chang, Academia Sinica
    12.00  A modified combination test for the analysis of clinical trials - Markus Neuhäuser, Rhein Ahr Campus
  Boardroom - Genetics (Chair: Julian Taylor)
    11.40  Believing in magic: validation of a novel experimental breeding design - Emma Huang, CSIRO Mathematics, Informatics and Statistics
    12.00  Phenotypes for training and validation of whole genome selection methods - Ken Dodds, AgResearch
12.20  Closing Remarks
12.30  Lunch
13.00  Conference Concludes


ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

9.00 Keynote Address (Swifts) - Louise Ryan, CSIRO Mathematics, Informatics and Statistics. Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise Ryan
CSIRO Mathematics, Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.


9.50 - 10.30, MONDAY 30TH NOV
Session 1, Swifts: Medical. Chair: John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal cancer (CRC) is one of the most malignant cancers worldwide, and it varies because of the different effects of risk factors in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluating the risk factors of the cancer as a whole would not provide a thorough understanding of the cancer. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis, according to the pathology report of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007, were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis in the Stata statistical software. The results confirm gender, alcohol history, IBD and tumor grade as specific risk factors of colon cancer, and hypertension, opium use and personal history as specific risk factors of rectum cancer. BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity: colon and rectum cancers should be evaluated separately to reveal hidden associations which may not be revealed under general modelling. These findings could provide more information for prognosis and treatment therapy, and possible application of screening programs specifically for colon and rectum carcinomas.
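For readers unfamiliar with the general approach, the following is a minimal, purely illustrative sketch of a competing risks regression in R (the study itself used Stata); the data frame `crc` and all column names are hypothetical stand-ins, not the study variables.

```r
# Illustrative only: Fine-Gray competing risks regression for one event type.
# 'crc' is a hypothetical data frame with follow-up time, event status and covariates.
library(cmprsk)
fit <- with(crc,
            crr(ftime   = time,                        # follow-up time
                fstatus = status,                      # 1 = event of interest, 2 = competing event, 0 = censored
                cov1    = cbind(gender, alcohol, ibd, grade)))
summary(fit)                                           # subdistribution hazard ratios
```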


PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia Mathematics Informatics and Statistics Glen Osmond

South Australia2Department of Surgery University of Adelaide the Queen Elizabeth Hospital

Adelaide South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001, and whose outcomes were followed for more than five years.

The ERA model is available at the following website (wwwhealthadelaideeduausurgeryevar). The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using specialist UK vascular institute data. Despite UK patients being sicker (p<0.001), having larger aneurysms (p<0.001) and being more likely to die (p<0.05) than the Australian patients, the ERA model fitted UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher area under ROC curves and/or higher R2.

The ERA model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008) Eur J Vasc Endovasc Surg 35: 571-579.
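A minimal sketch of the kind of model selection described above (forward stepwise selection in a logistic regression), written in R rather than the authors' own software; `era` and its column names are hypothetical illustrations only.

```r
# Illustrative sketch: forward stepwise binomial GLM with logit link,
# selecting preoperative predictors of one endpoint (e.g. early death).
fit0 <- glm(early_death ~ 1, family = binomial(link = "logit"), data = era)
fit  <- step(fit0,
             scope = ~ age + asa_rating + gender + aneurysm_diameter +
                       creatinine + neck_angle + neck_length + neck_diameter,
             direction = "forward")
summary(fit)   # selected predictors and their estimated effects
```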


9.50 - 10.30, MONDAY 30TH NOV
Session 1, Boardroom: Ecological Modelling. Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics Informatics and Statistics2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that an annual multi-species fishery-independent survey be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with a credible interval for each region as well as the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
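As a rough illustration of what a penalised regression spline surface for density looks like, the sketch below uses mgcv in R rather than the MCMC/BayesX implementation described above; the data frame `trawls`, its columns, and the Tweedie family choice are assumptions made for the example only.

```r
# Illustrative only: a penalised thin-plate spline surface over space for
# prawn density from trawl samples (hypothetical data frame 'trawls').
library(mgcv)
fit <- gam(density ~ s(lon, lat, bs = "tp", k = 60),   # smooth spatial surface
           family = tw(link = "log"),                  # Tweedie: copes with many zero catches (assumed choice)
           data   = trawls)
summary(fit)
```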


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics Informatics and Statistics Australia2School of Mathematics and Statistics Northeast Normal University China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust, and has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions. To alleviate the computational burden we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.


11.00 - 12.20, MONDAY 30TH NOV
Session 2, Swifts: Modelling. Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird11VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize Σ e(Q − I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
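A minimal worked example of quantile regression in R with the quantreg package (one of the implementations mentioned above); the data here are simulated solely for illustration.

```r
# Fit the 10th, 50th and 90th percentile lines to toy heteroscedastic data.
library(quantreg)
set.seed(1)
x <- runif(200, 0, 10)
y <- 2 + 0.5 * x + rt(200, df = 3) * (0.5 + 0.1 * x)   # heavy-tailed, spreading errors
fit <- rq(y ~ x, tau = c(0.1, 0.5, 0.9))               # one curve per quantile
summary(fit)
```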


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS University of Newcastle2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is conducted to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially, the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed to be distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.
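For orientation, a generic form of the first (two-level) random-effects meta-analysis model described above is sketched below; the notation and the priors shown are illustrative assumptions, not taken from the study.

```latex
% y_i = observed log risk ratio in study i, with known sampling variance s_i^2
\begin{align*}
  y_i \mid \theta_i        &\sim \mathrm{N}(\theta_i,\, s_i^2) \\
  \theta_i \mid \mu, \tau^2 &\sim \mathrm{N}(\mu,\, \tau^2) \\
  \mu \sim \mathrm{N}(0,\, 10^3), &\qquad \tau \sim \mathrm{Uniform}(0,\, 10)
  \quad\text{(example vague priors)}
\end{align*}
```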


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie11Department of Statistics and School of Geography Geology and Environment

University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are the result of a possible failure of the uniformitarianism principle. This is because the responses of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and hence this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring width indices for each core was divided into two subsets:

1. The portion of the series produced when the trees were small, and
2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham11Analytics - SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers a way to experiment with new cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together, and plans for future integration.

11.00 - 12.20, MONDAY 30TH NOV
Session 2, Boardroom: Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington NZ2Waseda University Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit ANU2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaCon™ is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals, but as yet has not been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaCon™ (Vac1), or a single vaccination of GonaCon™ followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions, at irregular intervals, over the next 115 weeks. Of particular interest is whether there is any difference between the animals that received the single or the boosted vaccination.

The data are analysed using repeated measures methods to assess the long-term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.
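One option for unequally spaced repeated measures, of the general kind alluded to above, is a mixed model with a continuous-time AR(1) correlation structure; the sketch below is illustrative only (not the authors' analysis), and `wallaby`, `weight`, `week`, `treatment` and `animal` are hypothetical names.

```r
# Illustrative sketch: growth over irregularly spaced occasions, with a
# continuous-time AR(1) residual correlation that tolerates unequal spacing.
library(nlme)
fit <- lme(weight ~ treatment * week,
           random      = ~ 1 | animal,
           correlation = corCAR1(form = ~ week | animal),
           data        = wallaby)
summary(fit)
```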


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics Informatics and Statistics2CSIRO Wealth from Ocean Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark11University of Otago

E-mail aclarkmathsotagoacnz

When the chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S. W. (1958, JEBS, 69-82) and Huynh, H. & Feldt, L. S. (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.

13.30 MONDAY 30TH NOV, Invited Speaker (Swifts) - Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland NZ2University of California Davis US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R, and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.
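A small, generic illustration of the kind of efficiency issue at stake (not necessarily one of the speakers' own examples): growing an object inside a loop forces repeated copying, whereas a vectorised call avoids it.

```r
# Compare a loop that grows a vector with the equivalent vectorised call.
n <- 1e5
system.time({ x <- numeric(0); for (i in 1:n) x <- c(x, sqrt(i)) })  # slow: repeated copying
system.time({ y <- sqrt(1:n) })                                      # fast: vectorised
stopifnot(all.equal(x, y))                                           # same result either way
```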


14.10 - 15.10, MONDAY 30TH NOV
Session 3, Swifts: Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster11Department of Statistics University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, to approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modelling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine Monash University

Melbourne Australia2NHMRC and CCRE in Gait Analysis Hugh Williamson Gait Laboratory Royal

Children's Hospital and the Murdoch Children's Research Institute Melbourne

Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods is illustrated with examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail: hwan-jin.yoon@anu.edu.au

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlex.

Using fungus microsatellite data we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages in which ANOVA and REML are standard methods may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
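A hedged sketch of the general-package route described here (not the authors' code; the data frame fungus and the per-individual molecular measure gendist are hypothetical): the AMOVA tiers map directly onto nested random effects estimated by REML.

library(lme4)
## Populations nested within regions; REML variance components correspond
## to the among-region, among-population-within-region and within-population
## tiers of an AMOVA table.
fit <- lmer(gendist ~ 1 + (1 | region/population), data = fungus)
as.data.frame(VarCorr(fit))   # REML variance component estimates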


1410 - 1510

MONDAY 30TH NOV Session 3 (Boardroom): Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research; 2University of Otago

E-mail: marcusdavy@plantandfood.co.nz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla1,2

1CMIS, CSIRO; 2Adelaide University

E-mail: juliantaylor@csiro.au

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation such as spatial trends and extraneous environmental variation need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high dimensional genetic component becomes problematic. This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way even when the number of genetic variables exceeds the number of observations. This method is then applied to wheat quality traits and a well established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch; 2Livestock Improvement Corporation

E-mail: zanetapark-ng@agresearch.co.nz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, and associated phenotypic data (milk yield; protein, casein and total solids percentage and yield; and growth hormone, IGF and insulin levels). This data is highly valuable as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data was analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data was used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.
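An illustrative sketch of the screening-plus-mixed-model step described above (object names expr, pheno, milk_yield and sire are placeholders, not the authors' code):

library(lme4)
## 'expr' is a genes x cows matrix of expression values; 'pheno' holds the
## phenotypes and the sire identifier for each cow.
pvals <- apply(expr, 1, function(g) cor.test(g, pheno$milk_yield)$p.value)
top   <- names(sort(pvals))[1:20]                 # genes with smallest p-values
fits  <- lapply(top, function(g) {
  d <- data.frame(y = pheno$milk_yield, x = expr[g, ], sire = pheno$sire)
  lmer(y ~ x + (1 | sire), data = d)              # sire as a random term
})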


1540 - 1700

MONDAY 30TH NOV Session 4 (Swifts): Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail: simonday@Roche.com

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined. Some are highly statistical, some are based much more on clinical judgement, some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment, nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail: AndrewMcLachlan@plantandfood.co.nz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample, these texture analysis and rheological methods generated many data points which were plotted as curves. Summarising these curves usually involves finding points of interest such as peaks or troughs and points of maximum slope, which is often done subjectively by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly.
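A minimal sketch of the kind of curve summaries described (the data layout is assumed; this is not the authors' implementation):

## Locate local peaks, troughs and the point of maximum slope on a curve
summarise_curve <- function(x, y) {
  dy      <- diff(y) / diff(x)                      # slope from first differences
  peaks   <- which(diff(sign(diff(y))) == -2) + 1   # local maxima
  troughs <- which(diff(sign(diff(y))) ==  2) + 1   # local minima
  list(peak_x = x[peaks], trough_x = x[troughs],
       max_slope_x = x[which.max(abs(dy))])
}

## Synthetic example curve
x <- seq(0, 10, by = 0.01)
y <- sin(x) + 0.1 * x
summarise_curve(x, y)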


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ; 2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail: gjones@massey.ac.nz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful, hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines; 2Western Visayas Medical Center, Iloilo City, Philippines

E-mail: vtbal@yahoo.com

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurements of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to those of previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the Chitty data and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth differs between populations. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords Fetal growth biparietal diameter head circumference abdominal circumference femur length
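A hedged sketch of one common way such centile curves are built (this assumes a quadratic mean curve with approximately normal residuals, which may differ from the authors' exact method; column names bpd and ga are hypothetical):

fit  <- lm(bpd ~ ga + I(ga^2), data = growth)       # quadratic mean curve
newd <- data.frame(ga = 20:42)
mu   <- predict(fit, newd)
sdev <- summary(fit)$sigma
centiles <- data.frame(ga = newd$ga,
                       p10 = mu + qnorm(0.10) * sdev,
                       p50 = mu,
                       p90 = mu + qnorm(0.90) * sdev)
matplot(centiles$ga, centiles[, -1], type = "l", lty = 1,
        xlab = "Gestational age (weeks)", ylab = "BPD")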


1540 - 1700

MONDAY 30TH NOV Session 4 (Boardroom): Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia; 2School of Biological Sciences, University of Sydney, Australia

E-mail: mueller@maths.usyd.edu.au

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled in dependence on a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, eg based on the number of parameters in the model.
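A minimal sketch of the kind of candidate-model comparison being visualised (variable names presence, food and predator_risk are hypothetical):

m1 <- glm(presence ~ food,                 family = binomial, data = wallaby)
m2 <- glm(presence ~ predator_risk,        family = binomial, data = wallaby)
m3 <- glm(presence ~ food + predator_risk, family = binomial, data = wallaby)
AIC(m1, m2, m3)   # likelihood-based criteria for each candidate model
BIC(m1, m2, m3)
## Refitting the candidates on bootstrap resamples of the plots then shows
## how often each model is selected, a simple check of selection stability.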

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail: RossDarnell@csiro.au

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, eg how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.
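A hedged sketch of the Bradley-Terry idea with made-up comparison data (indicator names and counts are invented; this simplifies the expert ratios to paired preferences and is not the authors' implementation). The model can be fitted as a logistic regression on indicator contrasts:

dat <- data.frame(ind1 = c("water", "water", "soil"),
                  ind2 = c("soil",  "biota", "biota"),
                  win1 = c(5, 7, 4),   # times ind1 judged more important
                  win2 = c(3, 1, 6))   # times ind2 judged more important
inds <- c("water", "soil", "biota")
X <- t(apply(dat, 1, function(r)
  (inds == r["ind1"]) - (inds == r["ind2"])))   # +1/-1 contrast per comparison
colnames(X) <- inds
fit <- glm(cbind(dat$win1, dat$win2) ~ X[, -1] - 1, family = binomial)
ability <- c(0, coef(fit)); names(ability) <- inds  # first indicator as reference
weights <- exp(ability) / sum(exp(ability))         # normalised indicator weights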


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Australia, Mathematics, Informatics and Statistics; 2Queensland Department of Environmental and Resource Management

E-mail: melissadobbie@csiro.au

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, how to best handle the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas; 2University of the Philippines Diliman

E-mail: cyann_mars@yahoo.com

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords backfitting response surface model second order model central composite design
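A simple illustrative backfitting loop for a two-term additive model (our sketch, not the authors' code): each term's fit is refreshed against the partial residuals of the other until the estimates stop changing.

backfit <- function(y, x1, x2, tol = 1e-8, maxit = 100) {
  f2 <- rep(0, length(y))
  for (i in seq_len(maxit)) {
    f1     <- fitted(lm(I(y - f2) ~ x1))   # update term 1 on partial residuals
    f2_new <- fitted(lm(I(y - f1) ~ x2))   # update term 2 on partial residuals
    if (max(abs(f2_new - f2)) < tol) break
    f2 <- f2_new
  }
  list(term1 = f1, term2 = f2, fitted = f1 + f2)
}

## With well-behaved (eg near-orthogonal) design columns this reproduces the
## ordinary least squares fit of y ~ x1 + x2.
set.seed(1)
x1 <- rnorm(50); x2 <- rnorm(50)
y  <- 1 + 2 * x1 - x2 + rnorm(50)
isTRUE(all.equal(backfit(y, x1, x2)$fitted, fitted(lm(y ~ x1 + x2)), tolerance = 1e-6))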


TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland1
1Department of Health Sciences, University of York

E-mail: mb55@york.ac.uk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.
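One common way to allow for clustering by treatment provider, shown here as a hedged sketch with hypothetical variable names (not necessarily the simple method proposed in the talk), is a random intercept for the operator:

library(lme4)
## outcome, treatment arm and therapist identifier from an individually
## randomised trial; the therapist variance quantifies the clustering effect
fit <- lmer(outcome ~ arm + (1 | therapist), data = trial)
summary(fit)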


950 - 1030

TUESDAY 1ST DEC Session 1 (Swifts): Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail: herbertthijs@uhasselt.be

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were more recently developed in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defense of the above mentioned methods rests on the fact that tolerability of pain drugs often is an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as other carried forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried forward family, we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.


APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail: kbeath@science.mq.edu.au

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

950 - 1030

TUESDAY 1ST DEC Session 1 (Boardroom): Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail: AlanWelsh@anu.edu.au

I will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
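A hedged sketch of this kind of model sequence (data frame sites and its variables are made up; the talk's own example and diagnostics may differ):

library(MASS)   # glm.nb
library(pscl)   # zeroinfl
m1 <- glm(count ~ habitat + year, family = poisson, data = sites)
sum(residuals(m1, type = "pearson")^2) / df.residual(m1)  # >> 1 suggests overdispersion
m2 <- glm.nb(count ~ habitat + year, data = sites)        # negative binomial
m3 <- zeroinfl(count ~ habitat + year | habitat,           # extra zeros as well
               dist = "negbin", data = sites)
AIC(m1, m2, m3)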


A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail: ianmarschner@mq.edu.au

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively rather than multiplicatively to a collection of predictor variables Such models have a range of applications but are particularly important in epidemiology where they can be used to model absolute differences in disease incidence rates as a function of covariates A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable due to the non-negativity constraints on the Poisson means I will present a straightforward and flexible method based on the EM algorithm which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients to be non-negative rather than the fitted means I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space after which the global constrained maximum is identified from among the subset maxima Both categorical factors and continuous covariates can be accommodated the latter having either a linear form or a completely unspecified isotonic form The method is particularly useful with resampling methods such as the bootstrap which may require reliable convergence for thousands of implementations The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts
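For illustration only (variable names are hypothetical, and this is not the method described above): the standard IWLS fit of an identity-link Poisson model in R typically needs starting values and can still fail when a fitted mean goes negative during iteration, which is what motivates a constrained, EM-based approach.

fit <- try(glm(deaths ~ exposure + age,
               family = poisson(link = "identity"),
               data = cohort,
               start = c(0.01, 0.001, 0.005)))   # may still fail to converge
## An EM-based constrained method instead keeps all fitted means non-negative
## by construction, so convergence is reliable, eg within bootstrap loops.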


1130 - 1230

TUESDAY 1ST DEC Session 2 (Swifts): Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia; 2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail: annettekifley@students.mq.edu.au

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects These assessments usually involve multiple QOL questionnaires each containing a mix of items about diverse specific and global aspects of QOL Quality of life itself is regarded as an unobserved underlying construct

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies Common approaches include selecting from or averaging the one or two direct global item measures obtained or calculating a summary score from the subdimensional item measures of a QOL questionnaire An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL The first two approaches do not take advantage of all the information collected while the third assumes that questions of interest fall into a relatively small number of theoretical domains which may not always be the case

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework using data from two clinical studies in cancer patients This methodology utilises all the available data accommodates the common problem of missing item responses obviates the need for precalculated or selected summary scores and can capture underlying correlations and dimensions in the data

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures Models that delineate QOL scales will be compared with those that delineate QOL domains and the contribution of different variance components will be assessed Since the data comprises a mix of non-normal continuous response measures and ordinal response measures distributional issues will also be considered


ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1, Andrea Rotnitzky2,3 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina; 2Universidad T. di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail: lorellana@ic.fcen.uba.ar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models, and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV positive patients to illustrate estimation of the optimal CD4 count level to start HAART


PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia; 2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia; 3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia; 4Department of Statistics, Macquarie University, Sydney, Australia; 5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail: jishengcui@deakin.edu.au

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models, the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk of those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.


1130 - 1230

TUESDAY 1ST DEC Session 2 (Boardroom): Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney F07, Sydney NSW 2006; 2School of Mathematics and Statistics, University of Sydney F07, Sydney NSW 2006

E-mail: annac@maths.usyd.edu.au

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities and is in that sense superior to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing similar parameter estimates after imputation as those found when data was fully observed. It will be shown that the amount of missingness present in the data set and the nature of the variable in question affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.


STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail: hwang@sinica.edu.tw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables This has become an increasingly important problem as more scientists facilitate techniques to produce high dimensional data to unveil hidden information Although several model based methods are promising for the identification of influential marker sets in some real applications each method has its own advantages and limitations The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters which is still a difficult task This article provides a completely different solution with a simple and novel idea for the identification of influential sets of variables The method is simple as it involves only repeatedly implementing single-term analysis of variation The main idea is to stepwise pare down the total variation of responses so that the remaining influential sets of factors have increased chances of being identified We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages including the popular group lasso logic regression and Bayesian QTL mapping methods through simulation studies A real data example shows additional interesting findings that result from using the proposed algorithm We also suggest ways to reduce the computational burden when the number of factors is very large say thousands


EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail: dematthe@uwaterloo.ca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e., the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1 / p2 and r- = (1 - p1) / (1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio rx = f1(x) / f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating rx and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.


WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail: tlumley@u.washington.edu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample for a "validation study" or a sample stratified on a health outcome for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.


950 - 1030

WEDNESDAY 2ND DEC Session 1 (Swifts): Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland; 2Clinical Trials Research Unit, University of Auckland

E-mail: ascott@auckland.ac.nz

Arbogast et al (2002, Biometrical J 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al 1997, Ann Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al for their simulations.


CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute; 2Texas A and M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail: kipnisv@mail.nih.gov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease It is now well appreciated that FFQs involve substantial measurement error both random and systematic leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships To correct for this error many cohorts include calibration sub-studies in which more precise short term dietary instruments such as multiple 24-hour dietary recalls (24HRs) or food records are administered as reference instruments Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic since short term reference instruments usually include a substantial proportion of subjects with zero intakes violating the assumption that reported intake is continuous We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods using a repeat unbiased short-term reference instrument in a calibration sub-study We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study


950 - 1030

WEDNESDAY 2ND DEC Session 1 (Boardroom): Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail: HideyasuShimadzu@ga.gov.au

Scientific trawl and sled surveys are necessary tasks for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue a simple conceptual model is proposed, reflecting the sampling process commonly used in marine surveys, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling is quite influential on presence/absence measures of species, and is not ignorable.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail: tyee@auckland.ac.nz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, eg catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, eg modifying the point system to give greater reward to bigger fish.


1100 - 1220

WEDNESDAY 2ND DEC Session 2 (Swifts): Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail: lisayelland@adelaide.edu.au

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
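A hedged sketch of two of the approaches being compared (hypothetical RCT data with a 0/1 treatment indicator; this is not the authors' simulation code):

library(sandwich)
library(lmtest)
## Log-binomial regression: direct relative risk, but may fail to converge
m_lb <- glm(outcome ~ treatment + baseline,
            family = binomial(link = "log"), data = rct)
## Log-Poisson regression with robust (sandwich) standard errors
m_lp <- glm(outcome ~ treatment + baseline,
            family = poisson(link = "log"), data = rct)
coeftest(m_lp, vcov. = vcovHC(m_lp, type = "HC0"))
exp(coef(m_lp)["treatment"])   # adjusted relative risk estimate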


MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University; 2Academia Sinica

E-mail: espark02@gmail.com

The idea of a response adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial without diminishing its statistical significance and efficiency too much. In addition, innovation in genomics-related biomedical research is making personalized medicine possible, which also makes adjustment for the covariates of subjects who join the trial an important issue in a clinical trial.

The adaptive design is a longstanding statistical method for situations where the design for a statistical model involves unknown parameters that must be estimated during the course of an experiment. Thus the concept of sequential analysis is naturally involved. The large sample properties of estimation under such a problem have been studied and can be found in the literature, for example Zhang et al (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires both the estimation and design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure -- the multiple-stage method -- which requires the estimation and design to be updated at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it retains the advantages of the fully sequential method to some extent and is more convenient in practical operation. Here we study a three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesized data is also presented.

Traditionally we use a response-adaptive (RA) design by assuming there is no treatment-covariate interaction effect, where the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical reason, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the method with the RA design will make incorrect treatment allocations. That is, it can be correct in one part of the population but completely wrong in the other. Thus in this case the CARA design should perform better than the RA design.


In this work we also compare sequential analysis in response adaptive designs with and without covariate adjustment, and a numerical study of synthesized data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail: patrickgraham@otago.ac.nz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting treatment practices of better performing hospitals A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals In this paper I formulate hospital performance comparisons within the framework of potential outcomes models This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables Since the number of confounding case-mix variables to be controlled is generally large implementation of conventional modelling approaches can be impractical The potential outcomes framework leads naturally to consideration of propensity score methods which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses However in analyses involving a multiple category exposure such as hospital in a multiple-hospital study multiple propensity scores must be constructed and controlled and this creates another set of logical and practical modelling problems Some of these issues are resolved using the approach of Huang et al (2005 Health Services Research 40 253-278) which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks which are then subject to a smoothing procedure In this paper I extend this approach in two-ways Firstly by adapting the propensity score methodology to the joint modelling of multiple outcomes and secondly by formalising the smoothing of standardised rates within a hierarchical Bayesian framework The approach and associated methodological issues are discussed in the context of a comparison of 30 day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction stroke or pneumonia


LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University; 2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail: iliu@msor.vuw.ac.nz

For a two-way contingency table with categorical variables local odds ratios are common to describe the relationships between the row and column variables If a study attempts to control other factors that might influence the relationships a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable An ordinary case has mutually exclusive cell counts ie all subjects must fit into one and only one cell However many surveys have a situation that respondents may select more than one outcome category The observations can fall in more than one category in the table This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases We derive new dually consistent (co)variance estimators and show their performance with a simulation study


1100 - 1220

WEDNESDAY 2ND DEC Session 2 (Boardroom): Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail: rogerlittlejohn@agresearch.co.nz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail: mdantuono@agric.wa.gov.au

In this paper I discuss some statistical approaches in the estimation of lambing rates and the 'seemingly' lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1 Murray Judd2 John Meekings3 Annette Richardson3 and Eric Walton4

1Swinburne University of Technology; 2Seeka Kiwifruit Industries; 3The New Zealand Institute for Plant and Food Research Ltd; 4University of Otago

E-mail: dmeyer@swin.edu.au

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A Chev) CF Liang et AR Ferguson var Deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail: Paulineding@anu.edu.au

There is a growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were Ground Cover Type (covered, uncovered), Distance from the log (0cm, 10cm, 20cm, 40cm, 80cm) and Depth (1cm, 5cm). Two non-linear mixed models were used to study the different treatment effects.

1330 WEDNESDAY 2ND DEC Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1
1Wagga Wagga Agricultural Institute, Australia

2Rothamsted Research Harpenden UK

E-mail: alisonsmith@industry.nsw.gov.au

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield Such trials are also used to obtain information on grain quality traits but these are rarely subjected to the same level of statistical rigour The data are often obtained using composite rather than individual replicate samples This precludes the use of an efficient statistical analysis In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield) This allows application of efficient mixed model analyses for both grain yield and grain quality traits


1410 - 1510

WEDNESDAY 2ND DEC Session 3 (Swifts): Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by postblocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. Usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second phase laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
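
A minimal sketch of how such a Phase 2 analysis might look in LIMMA is given below; the objects MA (normalised two-colour log-ratios) and targets (which sample was hybridised to each array) are hypothetical placeholders, and the contrast picked out by coef is only illustrative.

    # Hypothetical LIMMA sketch for the Phase 2 microarray experiment
    library(limma)
    design <- modelMatrix(targets, ref = "control")  # two-colour design matrix from the array layout
    fit <- lmFit(MA, design)                         # gene-wise linear models
    fit <- eBayes(fit)                               # empirical Bayes moderated t-statistics
    topTable(fit, coef = 2, adjust = "BH")           # genes responding to the chosen treatment contrast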

1410 - 1510

WEDNESDAY 2ND DEC Session 3 (Boardroom): Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION

Katrina Poppe1 Gillian Whalley1 Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately from relaxation.

Three-dimensional ultrasound images of the heart allow volume in the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against the first and second derivatives evolves a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation and so develop a new measure of global cardiac function.


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology separates and quantifies each of the compounds that make up the test substance. Typically the first step in an analysis of data like this is the alignment of the data, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances - e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves. In the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al (Anal Chem 2009 81 (3) pp 1000-1007)
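
The following small, self-contained R function is not the authors' implementation, but it illustrates the idea: a fixed penalty is added to the accumulated cost whenever a non-diagonal warping step is taken.

    # Illustrative variable-penalty DTW cost (toy version, not the published algorithm)
    vp_dtw_cost <- function(x, y, penalty = 0) {
      n <- length(x); m <- length(y)
      D <- matrix(Inf, n + 1, m + 1); D[1, 1] <- 0
      for (i in 1:n) {
        for (j in 1:m) {
          d <- abs(x[i] - y[j])
          D[i + 1, j + 1] <- min(D[i, j] + d,                # diagonal step: no penalty
                                 D[i, j + 1] + d + penalty,  # vertical step: penalised
                                 D[i + 1, j] + d + penalty)  # horizontal step: penalised
        }
      }
      D[n + 1, m + 1]
    }
    vp_dtw_cost(sin((1:100) / 10), sin((1:100 + 4) / 10), penalty = 0.05)

Raising the penalty discourages non-diagonal moves, which is the mechanism that reduces over-warping.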


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1 Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of the between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, e.g., two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da of the peaks corresponding to the isotopic distributions of peptides from the labeled sample is induced, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may incorporate various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation of the model. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


1540 - 1700

WEDNESDAY 2ND DEC Session 4 (Swifts): Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1
1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians grows from a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although this is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances when this is not the case, which will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates, which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the FWER or FDR should enable us to improve the power of our procedures to control these quantities.

As well as summarising some of the existing results in this area this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored with the aim of developing methods to adjust for it
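
As a toy illustration of the setting (dependent test statistics generated by a short linear process, then a false discovery rate adjustment), one might run something like the following; it is not taken from the talk.

    # Dependent null statistics from a 3-term moving average, adjusted by Benjamini-Hochberg
    set.seed(1)
    m <- 5000
    e <- rnorm(m + 2)
    z <- (e[1:m] + e[2:(m + 1)] + e[3:(m + 2)]) / sqrt(3)  # correlated under the null
    p <- 2 * pnorm(-abs(z))
    sum(p.adjust(p, method = "BH") < 0.05)                 # discoveries even though all nulls are true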


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand
2University of South Carolina, USA

3University of Montreal Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution, we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities. Using various different examples, we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.
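
The sketch below is a deliberately simplified stand-in for the idea of adapting a proposal from previously sampled states: an independence Metropolis-Hastings sampler whose normal proposal is periodically re-tuned from the chain history. It does not use the triangular/trapezoidal mixture proposals of the talk, and it ignores the theoretical care needed to justify adaptation.

    # Toy adaptive independence Metropolis-Hastings (illustration only)
    target <- function(x) 0.3 * dnorm(x, -2, 0.7) + 0.7 * dnorm(x, 2, 1)
    n <- 10000; x <- numeric(n); mu <- 0; sg <- 3
    for (i in 2:n) {
      y <- rnorm(1, mu, sg)                               # independence proposal
      a <- target(y) * dnorm(x[i - 1], mu, sg) /
           (target(x[i - 1]) * dnorm(y, mu, sg))          # acceptance ratio
      x[i] <- if (runif(1) < a) y else x[i - 1]
      if (i %% 500 == 0) { mu <- mean(x[1:i]); sg <- sd(x[1:i]) }  # re-tune the proposal
    }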


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1
1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) are used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1
1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


1540 - 1700

WEDNESDAY 2ND DEC Session 4 (Boardroom): Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1
1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show that the new approach performs better through simulation studies and some real-world biological data sets.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally-sized or balanced, and the classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class/group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). The traditional classification techniques perform poorly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being on building models to correctly classify the minority class.

In this presentation, a brief overview of the approaches found in the literature is given, followed by details of some alternatives proposed. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling the majority class. The second approach uses cost-sensitive learning, mainly with high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples and the findings are discussed.
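
A minimal sketch of the two generic strategies mentioned above, using simulated data and plain logistic regression (the presentation's own proposed methods are not reproduced here):

    # Over-sampling the minority class versus cost-sensitive weighting
    set.seed(1)
    n <- 1000; x <- rnorm(n)
    y <- rbinom(n, 1, plogis(-3 + x))                      # roughly 5% minority class
    dat <- data.frame(x, y)
    over <- rbind(dat, dat[rep(which(dat$y == 1), 9), ])   # over-sample minority cases 10-fold
    fit_over <- glm(y ~ x, family = binomial, data = over)
    w <- ifelse(dat$y == 1, 10, 1)                         # higher cost for minority misclassification
    fit_cost <- glm(y ~ x, family = binomial, data = dat, weights = w)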


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of error rates is of vital importance in classification problems, as this is used as a basis to choose the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and QDF are derived and computed for various covariance structures in a simulation exercise, which serve as a benchmark for comparison.

The approximation we introduce in this paper reduces the amount of computation involved. This approximation also provides a closed-form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations such as the misclassification error computation in discriminant analysis.

Keywords: multivariate normal distributions; linear discriminant function; quadratic discriminant function; Euclidean distance classifier; contaminated data


THURSDAY 3RD DEC

900 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS - A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1 1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory Bowel Diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


950 - 1030

THURSDAY 3RD DEC Session 1 (Swifts): Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao12 Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia QLD, Australia

E-mail klecaouqeduau

In cancer studies, various clinical or pathological factors have been evaluated as prognosis factors (e.g. tumour size, histological grade). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to gain insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1
1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift is used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.

950 - 1030

THURSDAY 3RD DEC Session 1 (Boardroom): Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS A STATISTICAL ENCOUNTER

Dave Saville1
1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting.


1100 THURSDAY 3RD DEC Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1 IH Delacy12 J Crossa3 PM Kroonenberg4 MJ Dieters1 and KE Basford12

1The University of Queensland, Australia
2Australian Centre for Plant Functional Genomics, Australia
3CIMMYT, Mexico
4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM blocks and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


1140 - 1220

THURSDAY 3RD DEC Session 2 (Swifts): Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1
1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study the linear combination of markers, which usually improves the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve, say AUC and pAUC respectively. In some medical diagnostics it is necessary to confine the false positive rate within a specific range, which makes the pAUC a reasonable choice under such circumstances. Thus we emphasize pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

l_p = (w_D S_D + w_{\bar{D}} S_{\bar{D}})^{-1} (m_D - m_{\bar{D}}),

where m_D, S_D and m_{\bar{D}}, S_{\bar{D}} are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients w_D, w_{\bar{D}} ∈ R^1 depend on the given specificity and are also functions of l_p. Thus the solution for l_p requires some iterative procedure. We apply it to the data set of Liu et al (2005, Stat in Med) and the numerical results show that our method outperforms that of Liu et al (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes the pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large-sample properties of this method are derived. We then apply it to some real data sets and the results are very promising, locating markers that are never found via AUC-based methods.
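
For readers who want to experiment, a simple empirical pAUC for a given linear combination can be computed directly; the function below restricts the false positive rate to [0, 0.1] on simulated markers and is only an illustration, not the LARS-like algorithm of the talk.

    # Empirical pAUC of the combined score X %*% beta, false positive rate restricted to [0, fpr_max]
    pauc <- function(score, d, fpr_max = 0.1) {
      grid <- seq(0, fpr_max, length = 200)
      cuts <- quantile(score[d == 0], 1 - grid)                   # thresholds giving each FPR
      tpr  <- sapply(cuts, function(cc) mean(score[d == 1] > cc))
      sum(diff(grid) * (head(tpr, -1) + tail(tpr, -1)) / 2)       # trapezoidal area
    }
    set.seed(1)
    X <- matrix(rnorm(400), 200, 2); d <- rbinom(200, 1, 0.5)
    X[d == 1, ] <- X[d == 1, ] + 1
    pauc(X %*% c(0.5, 0.5), d)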


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1
1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed performing separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and combining the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study: both phases are analysed at the end of the study. Therefore an asymmetric decision rule, as proposed by Bauer & Köhne (1994) for adaptive designs, is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994). For example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
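
Taking α0, α1 and the critical value cα for the product p1p2 as given, the decision rule described above is straightforward to code. The helper below uses the classical Fisher-type bound for the product purely for illustration; the abstract's exact joint calibration of α1 and cα differs.

    # Modified combination test decision rule (alpha0, alpha1, c_alpha supplied by the user)
    modified_combination <- function(p1, p2, alpha0, alpha1, c_alpha) {
      (max(p1, p2) <= alpha1) ||
        (max(p1, p2) <= alpha0 && p1 * p2 <= c_alpha)
    }
    c_fisher <- function(alpha) exp(-0.5 * qchisq(1 - alpha, df = 4))  # Fisher product bound
    modified_combination(0.03, 0.20, alpha0 = 0.5, alpha1 = 0.1793, c_alpha = c_fisher(0.05))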


1140 - 1220

THURSDAY 3RD DEC Session 2 (Boardroom): Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1 2 Colin Cavanagh2 3 Matthew Morell2 3 and Andrew George1 2

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Food Futures National Research Flagship

3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represents phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, due to multiple founders and intermediate generations being unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1 Benoit Auvray1 Peter Amer2 Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch, Invermay Agricultural Centre, Mosgiel, New Zealand
2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will almost be true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry, we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets and appropriate phenotypes for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN

Ruth Butler1
1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p values are often interpreted in a classical analysis as giving 1 minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews). In this poster, Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1 Matthews (2001, J Stat Plan Inf 94, 43-58)
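
A small numerical illustration of the contrast (entirely hypothetical data: 18 successes in 30 trials, flat Beta(1,1) prior):

    binom.test(18, 30, p = 0.5)$p.value   # classical two-sided p-value for H0: p = 0.5
    1 - pbeta(0.5, 1 + 18, 1 + 12)        # Bayesian posterior Pr(p > 0.5 | data)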


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1 Tony Swain2 Olena Kravchuk1 and Geoffry Fordyce2

1School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia
2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and Androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<57) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model fitted by REML in GenStat included paddock and heifer effects as random terms and unequal variances for the repeated measures, with the heifer × day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al, Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The interaction vaccine × weaner size × time was only significant in 1992. The interaction vaccine × time was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccinations, but the vaccine effect diminished as heifers aged. The interaction nutrition × weaner size × time was significant in 1990.

Overall the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study with a complicated unbalanced repeated measures design
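
For readers more familiar with R than GenStat, a rough analogue of this model could be specified with nlme (variable names are hypothetical and the fixed part is schematic); this fits paddock and heifer as nested random effects and a separate residual variance for each measurement day.

    # Approximate R (nlme) analogue of the GenStat REML model described above
    library(nlme)
    fit <- lme(log(P4) ~ (vaccine + size + nutrition) * day,
               random  = ~ 1 | paddock/heifer,
               weights = varIdent(form = ~ 1 | day),   # unequal variances across days
               data    = heifers, na.action = na.omit)
    anova(fit)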


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1
1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable. Omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the tree, which is used to predict the other half of the data. By examining the predictive ability of several thousands of trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
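
A hedged sketch of such a gbm fit (the data frame cicada and response abundance are hypothetical placeholders):

    library(gbm)
    fit <- gbm(abundance ~ ., data = cicada, distribution = "poisson",
               n.trees = 3000, interaction.depth = 3, shrinkage = 0.01,
               bag.fraction = 0.5, train.fraction = 0.5)
    best <- gbm.perf(fit, method = "test")     # number of trees minimising held-out error
    summary(fit, n.trees = best)               # relative influence of each variable
    predict(fit, newdata = cicada, n.trees = best, type = "response")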


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.
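
One common way (not necessarily the authors') to accommodate both excess zeros and overdispersion in such counts is a zero-inflated negative binomial model; the data frame and variable names below are hypothetical.

    library(pscl)
    fit <- zeroinfl(count ~ treatment + month | 1, data = parasites, dist = "negbin")
    summary(fit)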

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land, Crop and Food Sciences, University of Queensland
2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling in an experiment investigating the digestibility of sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions changed in an obvious and complex way with changes in the milling energy. The average volumetric diameter alone was not an adequate summary of the distributions. It was thus necessary to construct a tailored algorithm for summarizing the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1 Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne
2Remote sensing team, CSIRO Sustainable Ecosystems

3Senior Lecturer The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
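
A minimal sketch of ridge and LASSO fits with cross-validated penalty selection, using simulated data in place of the MODIS-derived predictors:

    library(glmnet)
    set.seed(1)
    X <- matrix(rnorm(100 * 20), 100, 20)                  # stand-in predictor matrix
    y <- drop(X[, 1:3] %*% c(2, -1, 1)) + rnorm(100)       # stand-in response
    ridge <- cv.glmnet(X, y, alpha = 0)                    # ridge regression
    lasso <- cv.glmnet(X, y, alpha = 1)                    # LASSO
    coef(lasso, s = "lambda.min")                          # coefficients at the CV-optimal penalty
    predict(ridge, newx = X, s = "lambda.min")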


CAUTION COMPOSITIONS CAN CONSTRAINTS ON lsquoOMICSrsquo LEAD ANALYSES ASTRAY

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia
2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens- if not hundreds-of-thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and to explore the extent to which this might be a problem in applications. In particular, we compare the analysis of log-transformed data to full compositional data analysis.
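
As an illustration of one standard compositional treatment (a centred log-ratio transform in the sense of Aitchison), the short function below operates on a matrix of counts with one sample per row; the pseudo-count is an arbitrary choice to cope with zeros.

    clr <- function(counts, pseudo = 0.5) {
      x  <- counts + pseudo            # avoid log(0) for unobserved components
      lx <- log(x)
      lx - rowMeans(lx)                # subtract each sample's mean log (log geometric mean)
    }
    z <- clr(matrix(rpois(60, 5), nrow = 6))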


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model with the NPF trawling experiment data. Using a Probability-Probability plot as a visual tool of validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small. This is partly explained by a sensitivity analysis of the gamma distribution. As an alternative estimate, we have employed a minimum-squares-type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data except for one species out of 83 species.
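
A sketch of the two estimates being compared, using simulated weights in place of the trawl data:

    library(MASS)
    w  <- rgamma(200, shape = 1.2, rate = 0.5)               # stand-in weight data
    ml <- fitdistr(w, "gamma")                                # maximum likelihood estimate
    pp_ss <- function(par) {                                  # sum of squares on the P-P plot
      u <- pgamma(sort(w), shape = par[1], rate = par[2])
      sum((u - ppoints(length(w)))^2)
    }
    ls <- optim(ml$estimate, pp_ss, method = "L-BFGS-B", lower = c(1e-6, 1e-6))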


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1
1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Diabetes complications such as kidney disease cause patients considerable pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data and information on this trend in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that used the files of patients with type 2 diabetes who attended the Isfahan Metabolic and Endocrine Research Center from 1997 to 2007. This information was collected longitudinally, and we used linear mixed-effects models to analyse the data. Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information this study provides can be used to identify groups at high risk of renal dysfunction.

Key words: longitudinal study; mixed-effects models; creatinine; type 2 diabetes


THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS UNDERGOING ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics, Department of
2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Most studies of postoperative complications for isolated CABG surgeries are from one population and may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgeries for an Australian population, because there is no model developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation set (60%) and a model validation set (40%). The data in the creation set were used to develop the model and the validation set was then used to validate the model. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC and the Hosmer-Lemeshow p-value respectively. Results: Among the 14533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3). The two postoperative complications considered are new renal failure (365) and stroke (138). The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L < 0.001). Conclusion: We have identified risk factors for two major postoperative complications of CABG surgery.


SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2
1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics, Informatics and Statistics
2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis where physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data. Instead, they are usually point predictions from spatial models based on auxiliary data sources. It is not clear what kind of effects the modelled covariates will have on the model, although simple approximations for simple models do give indications. We have performed some simulation studies to investigate the manner of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.


COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno
2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this article is to identify the dependency structure of gene variants which influence septic states in pediatric patients.

The data set contains the data of 580 pediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70. These results were published in [1, 2, 3]. To identify the role of different combinations of gene variants, and to describe the differences in frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. This way it was possible to create a 5-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patients group. The combinations of gene variants typical for the healthy group and for the septic patients group were then found. The results correspond nicely to those published in [1, 2, 3] for individual genes, and enable the typical combinations of variants of the six genes on which attention should be focused to be recognized.

References [1] Michalek J Svetlikova P Fedora P Klimovic M Klapacova L Bartonova D Hrstkova H Hubacek J A Bactericidal permeability increasing protein gene variants in children with sepsis Intensive Care Medicine ISSN 0342-4642 2007 vol 33 s 2158-2164 [2] Svetlikova P Fornusek M Fedora M Klapacov a L Bartosova D Hrstkova H Klimovic M Novotna E Hubacek JA Michalek J Sepsis Characteristics in Children with Sepsis [in czech abstract in english] In Cesko-Slovenska Pediatrie 59 p 632-636 2004 [3] Michalek J Svetlikova P Fedora P Klimovic M Klapacova L Bartonova D Hrstkova H Hubacek J A J Interleukine - 6 gene variants and the risk of sepsis development in children Human Immunology ELSEVIER SCIENCE INC ISSN 0198-8859 2007 vol 68 pp 756 - 760


IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD-SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd
220 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application of Bacillus thuringiensis Berliner (Bt) with a tractor-mounted boom sprayer was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion p of the moths remained was also estimated, viz for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
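
A minimal R sketch of this fitting and estimation step, with invented trap counts and assuming the dispersal-curve form quoted above, might look as follows:

## Invented data: mean marked-moth catch per trap at several distances
dist  <- c(50, 100, 200, 400, 750)   # metres from the sprayed field
catch <- c(120, 60, 31, 9, 2)        # hypothetical trap catches

## Log-link Poisson GLM for the dispersal curve
fit <- glm(catch ~ dist, family = poisson(link = "log"))
b   <- -coef(fit)["dist"]            # decay rate of the fitted curve

## Distance c within which a proportion p of the moths remained,
## using the relationship quoted in the abstract
p <- 0.90
f <- function(c) exp(-b * c) * (1 + b * c) - (1 - p)
uniroot(f, interval = c(1, 5000))$root   # estimated c in metres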


IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury
2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study all oxygen saturation measurements across different behavioural states were combined together. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than clustering based on the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M, Lee DS, Russell G, et al. (2008). Australian Statistical Conference 2008.
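
A rough sketch of the clustering step in R (simulated saturation segments, not the study data): evaluate each baby's empirical distribution function on a common grid and cluster the resulting curves.

## Simulated example: one vector of oxygen-saturation values per baby
set.seed(1)
n_babies <- 17
segs <- lapply(1:n_babies,
               function(i) rnorm(50, mean = 90 + (i %% 2) * 3, sd = 2))

## Evaluate each ECDF on a common grid, then cluster the curves
grid     <- seq(80, 100, by = 0.5)
ecdf_mat <- t(sapply(segs, function(x) ecdf(x)(grid)))
hc       <- hclust(dist(ecdf_mat), method = "ward.D2")
cutree(hc, k = 2)   # two candidate groups: stable versus unstable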


Index of Presenting Authors

Arnold R 33; Asghari M 26; Baird D 30; Balinas V 46; Barnes M 27; Basford KE 84; Beath K 51; Bland JM 49; Briggs J 76; Burridge C 28; Burzykowski T 73; Butler R (poster) 89; Campain A 56; Chang K 70; Chang Y 85; Chee C 77; Clark ASS 36; Clarke S 74; Clifford D 72; Connolly P (poster) 91; Cui J 55; D'Antuono M 67; Darnell R (1) 35; Darnell R (2) 47; Davy M 40; Day S 43; Ding P 69; Dobbie M 48; Dodds K 88; Fewster R 37; Forrester R 34; Ganesalingam S 79; Ganesh S 78; Gatpatan JMC 48; Graham M 33; Graham P 65; Huang E 87; Hwang J 57; Ihaka R 36; Jones G 45; Kifley A 53; Kipnis V 61; Koolaard J (poster) 92; Kravchuk O (poster 1) 90; Kravchuk O (poster 2) 92; Lazaridis D 93

Le Cao K 81; Littlejohn R 67; Liu I 66; Liu J 82; Lumley T 59; Marschner I 52; Matthews D 58; McLachlan A 44; Meyer D 68; Meyer R 75; Mohebbi M 38; Mueller S 47; Muller W (poster) 94; Naka M (poster) 95; Neeman T 82; Neuhäuser M 86; Orellana L 54; Park E 64; Park Z 42; Pirie M 32; Poppe K 71; Rousta S (poster) 96; Ruggiero K 71; Ryan L 25; Sanagou M (poster) 97; Saville D 83; Scott A 60; Shimadzu H 62; Shimadzu H (poster) 98; Sibanda N 76; Smerek M (poster) 99; Smith AB 69; Stewart M 77; Stojanovski E 31; Taylor J 41; Thijs H 50; Triggs CM 80; Wallace AR (poster) 100; Wang Y 29; Welsh A 51; Williams E 70; Yee T 62; Yelland L 63; Yoon H 39; Zahari M (poster) 101


DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau


Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde


Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz


KEYNOTE SPEAKERS

Martin Bland, University of York

Martin Bland joined the University of York as Professor of Health Statistics in 2003. Before this he spent 27 years at St George's Hospital Medical School, University of London, following posts at St Thomas's Hospital Medical School and in industry with ICI. He has a BSc in mathematics, an MSc in Statistics and a PhD in epidemiology. He is the author or co-author of An Introduction to Medical Statistics (now in its third edition) and Statistical Questions in Evidence-based Medicine (both Oxford University Press), of 190+ refereed journal articles reporting public health and clinical research and on research methods, and, with Prof Doug Altman, of the Statistics Notes series in the British Medical Journal. He is currently working on clinical trials in wound care, hazardous alcohol use, depression, irritable bowel syndrome and stroke prevention. His personal research interests are in the design and analysis of studies of clinical measurement and of cluster randomised clinical trials. His 1986 Lancet paper with Doug Altman on statistical methods for assessing agreement between two methods of clinical measurement has now been cited more than 13,000 times; it is the most cited paper ever to appear in the Lancet and has been reported to be the sixth most highly cited statistical paper ever.

Martin presented a two-day satellite course in Auckland on 25-26 November on Cluster Randomised Trials

Thomas Lumley University of Washington

Thomas Lumley is an Associate Professor in the Biostatistics Department at the University of Washington in Seattle. Thomas has accrued an impressive body of work and awards in a comparatively short amount of time. Since completing his PhD in 1998, Thomas has published well over 100 peer-reviewed articles in the leading journals of statistics, biostatistics and the health sciences, on theory, methodology and application. In addition he has given a substantial number of talks and workshops around the world. In 2008 Thomas was awarded the Gertrude Cox Award for contributions to Statistical Practice. Thomas is also a member of the R Core development team and his expertise in the field of statistical computing is recognised worldwide.


Louise Ryan CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health, Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics, Informatics and Statistics (CMIS). Dr Ryan has a distinguished career in biostatistics, having authored or co-authored over 200 papers in peer-reviewed journals. Louise is a fellow of the American Statistical Association and the International Statistics Institute, and is an elected member of the Institute of Medicine. She has served in a variety of professional capacities, including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society. She has served on advisory boards for several government agencies in the USA, including the National Toxicology Program and the Environmental Protection Agency, as well as several committees for the National Academy of Science. She retains an adjunct professorship at Harvard.

Chris Triggs University of Auckland

Chris Triggs is a Professor as well as being the current department head of Statistics at the University of Auckland, New Zealand. He has been a respected statistician for 30 years, specializing in fields as diverse as experimental design and forensic science. Professor Triggs has published more than 90 papers in a wide variety of statistical fields. His research interests include experimental design, population genetics and the application of statistical methods in many fields of science, including forensic science and nutrigenomics. He has lectured extensively in many of these subjects in Australasia. Professor Triggs is an Associate Editor for Biometrics and is often called upon as a referee for many other journals.


INVITED SPEAKERS

Ross Ihaka, University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland. He is recognized as one of the originators of the R programming language. In 2008 he received the Royal Society of New Zealand's Pickering Medal for his work on R.

Kaye Basford University of Queensland

Kaye Basford is Head of the School of Land, Crop and Food Sciences at the University of Queensland. Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments, in particular using a pattern analysis approach. Kaye is currently IBS Vice-President, in advance of her Presidential term 2010-11.

Alison Smith NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit, where she works on and researches methodology for plant breeding multi-environment variety trials, plant quality trait experiments, micro-array data and outlier detection in linear mixed models.


GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events.

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions.

Conference Catering
Lunches, Morning and Afternoon Teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16).

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk.

Transport has been arranged in coaches to transfer delegates to the dinner from the Suncourt Hotel & Conference Centre, leaving at 6 pm, with return trips at the conclusion of the event.

Welcome reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.


VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel
Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (02km)
Turn left at Titiraupenga Street (31m)
Turn right at Lake Tce (05km) (or alternatively go up to Heuheu Street then onto Tongariro Street)
Continue onto Tongariro Street (11km - go through one roundabout)
Continue onto SH 1 / SH5 (10km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past Honey Hive to the end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo. Phone (07) 378 8265. wwwsuncourtconz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo


ORGANISED SOCIAL ACTIVITIES
Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region.

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member, non-member or student) attending the whole week. Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events.

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6 pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim then a meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo. Tel (07) 378-7080).

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities- Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors, smell the coffee brewing as you board the Waikare II, take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings. A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina. Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park. The sights are amazing all year round. Afternoon tea is included as part of your charter and tea or coffee are complimentary throughout the cruise. There are also full bar facilities.

Fishing for, and hopefully eating, rainbow or brown trout is included in the charter, although to meet licence requirements only four clients can be nominated to actually land the catch. Only 4 lines can be put out at a time on downriggers. If successful, any catch can be barbequed or sashimied and served and shared onboard - there


is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting. The cost is $180 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front
Take: Swimwear including a towel if you want an invigorating deep water swim off the launch. Don't forget to take your camera as some of the scenery can only be seen from on the water
Cost: $70 per person based on a three hour scenic charter including fishing, with clay bird shooting extra at $180 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.


Leaving the gushing sounds of the mesmerizing Falls, you cut through leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (a waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating geothermal and nature - Orakei Karako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Karako, possibly the best thermal area in New Zealand.


In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's cave and some of the largest silica terraces in the world.

While the park visitors are on land, option 2 ($140) whisks swimmers away to the Squeeze. You will disembark the boat in knee-deep warm water. After manoeuvring your way through narrow crevasses, climbing boulders and wading through waist-deep warm water, you emerge in stunning native New Zealand bush. Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool.

Then the groups rejoin for the thrilling return trip, giving a total trip time of about three hours. This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience.

Time: Transport departs Suncourt Hotel at 1.30 pm, returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1 including park admission; $140 pp for option 2; both options including transport
Notes: For this activity to proceed we require a minimum of only 4 people in total as the same boat is used


4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop-off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly; terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: Cafe snack is not included but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22


SPONSORS
The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland


AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax


VENUE FLOOR PLAN

1. Boardroom: For all Board session presentations
2. Swifts: For keynote addresses, invited speaker talks and all Swifts sessions
3. Bathrooms/Toilets
4. 'Chill on Northcroft' Restaurant: All morning/afternoon teas and lunches will be provided here
5. Gullivers: Computer room with two internet access desktops
6. Lems: Registration desk location, and further desk space and power points for wireless internet access


CONFERENCE TIMETABLE

SUNDAY 29TH NOV
1600 Conference Registration opens
1800 Welcome Reception

Dinner (own arrangement)

MONDAY 30TH NOV
850 Presidential Opening (Swifts): Graham Hepworth, University of Melbourne
900 Keynote Address (Swifts): Louise Ryan, CSIRO Mathematics Informatics and Statistics. Quantifying uncertainty in risk assessment. Chair: Graham Hepworth

950

-1030

Session 1 Swifts Medical

Chair John Field

Session 1 Boardroom Ecological ModellingChair Teresa Neeman

950 Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approachMohamad Asghari Tarbiat Modares University

Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splinesCharis Burridge CSIRO Mathematics Informatics and Statistics

1010 Personalised medicine endovascular aneurysm repair risk assessment model using preoperative variablesMary Barnes CSIRO Mathematics Informatics and Statistics

Rank regression for analyzing environmental dataYou-Gan Wang CSIRO Mathematics Informatics and Statistics

1030 Morning Tea (30 minutes)1100

-1220

Session 2 Swifts Modelling

Chair Andrew McLachlan

Session 2 Boardroom Environmental amp Methods

Chair Zaneta Park1100 Introduction to Quantile

regressionDavid Baird VSN NZ Ltd

Capture recapture estimation using finite mixtures of arbitrary dimension Richard Arnold Victoria University


MONDAY 30TH NOV1120 Incorporating study

characteristics in the modelling of associations across studiesElizabeth Stojanovski University of Newcastle

The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabiesRobert Forrester ANU

1140 A comparison of matrices of time series with application in dendroclimatologyMaryanne Pirie University of Auckland

Model based grouping of species across environmental gradientsRoss Darnell CSIRO Mathematics Informatics and Statistics

1200 How SAS and R integrateMichael Graham SAS Auckland

The use of the chi-square test when observations are dependentAustina Clark University of Otago

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Writing Efficient Programs in R and Beyond. Chair: Renate Meyer

1410

-1510

Session 3 Swifts Variance

Chair Geoff Jones

Session 3 Boardroom Genetics

Chair John Koolaard

1410 Variance estimation for systematic designs in spatial surveysRachel Fewster University of Auckland

Developing modules in genepattern for gene expression analysisMarcus Davy Plant and Food Research

1430 Variance components analysis for balanced and unbalanced data in reliability of gait measurementMohammadreza Mohebbi Monash University

High dimensional QTL analysis within complex linear mixed modelsJulian Taylor CSIRO Mathematics Informatics and Statistics

1450 Modernizing AMOVA using ANOVAHwan-Jin Yoon ANU

Correlation of transcriptomic and phenotypic data in dairy cowsZaneta Park AgResearch

1510 Afternoon Tea (30 minutes)


MONDAY 30TH NOV1540

-1700

Session 4 Swifts Modelling

Chair Mario DrsquoAntuono

Session 4 Boardroom Ecology

Chair Rachel Fewster1540 Non-inferiority margins in

clinical trialsSimon Day Roche Products Ltd

Visualising model selection criteria for presence and absence data in ecology Samuel Mueller University of Sydney

1600 Data processing using Excel with RAndrew McLachlan Plant and Food Research Lincoln

Estimating weights for constructing composite environmental indicesRoss Darnell CSIRO Mathematics Informatics and Statistics

1620 Investigating covariate effects on BDD infection with longitudinal data Geoffrey Jones Massey University

A spatial design for monitoring the health of a large-scale freshwater river systemMelissa Dobbie CSIRO Mathematics Informatics and Statistics

1640 Statistical modelling of intrauterine growth for FilipinosVincente Balinas University of the Philippines Visayas

Backfitting estimation of a response surface modelJhoanne Marsh C Gatpatan University of the Philippines Visayas

1700 Poster SessionChair Melissa Dobbie

1800 Dinner (own arrangement)


TUESDAY 1ST DEC
900 Keynote Address (Swifts): Martin Bland, University of York. Clustering by treatment provider in randomised trials. Chair: Simon Day

950

-1030

Session 1 Swifts Missing Data

Chair Vanessa Cave

Session 1 Boardroom Count Data

Chair Hwan-Jin Yoon950 The future of missing data

Herbet Thijs Hasselt University

A strategy for modelling count data which may have extra zerosAlan Welsh ANU

1010 Application of latent class with random effects models to longitudinal dataKen Beath Macquarie University

A reliable constrained method for identity link Poisson regressionIan Marschner Macquarie University

1030 Morning TeaIBS Biennial General Meeting (60 minutes)

1130

-1230

Session 2 Swifts Medical

Chair Hans Hockey

Session 2 Boardroom Modelling

Chair Olena Kravchuk1130 Multivariate response

models for global health-related quality of lifeAnnette Kifley Macquarie University

Building a more stable predictive logistic regression modelAnna Campain University of Sydney

1150 Estimation of optimal dynamic treatment regimes from longitudinal observational dataLiliana Orellana Universidad de Buenos Aires

Stepwise paring down variation for identifying influential multifactor interactionsJing-Shiang Hwang Academia Sinica

1210 Parametric conditional frailty models for recurrent cardiovascular events in the lipid studyJisheng Cui Deakin University

Empirical likelihood estimation of a diagnostic test likelihood ratioDavid Matthews University of Waterloo

1230 Lunch (1 hour)1330 Organised Social Activities

1800 Dinner (own arrangement)


WEDNESDAY 2ND DEC
900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Using the whole cohort in analysis of subsampled data. Chair: Alan Welsh

950

-1030

Session 1 Swifts Clinical Trials

Chair Ian Marschner

Session 1 Boardroom Fisheries

Chair Charis Burridge950 Adjusting for nonresponse in

case-control studiesAlastair Scott University of Auckland

An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimationHideyasu Shimadzu GeoScience Australia

1010 Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associationsVictor Kipnis USA National Cancer Institute

On the 2008 World Fly Fishing ChampionshipsThomas Yee University of Auckland

1030 Morning Tea (30 minutes)

1100

-1220

Session 2 Swifts Medical Models

Chair Katrina Poppe

Session 2 Boardroom AgricultureHorticulture

Chair Emlyn Williams

1100 Relative risk estimation in randomised controlled trials a comparison of methods for independent observationsLisa Yelland University of Adelaide

Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactationRoger Littlejohn AgResearch

1120 Multiple stage procedures in covariate-adjusted response-adaptive designsEunsik Park Chonnam National University

Some statistical approaches in estimating lambing ratesMario DrsquoAntuono Dept of Agriculture WA


WEDNESDAY 2ND DEC1140 Potential outcomes and

propensity score methods for hospital performance comparisonsPatrick Graham University of Otago

FTIR analysis associations with induction and release of kiwifruit buds from dormancyDenny Meyer Swinburne University of Technology

1200 Local odds ratio estimation for multiple response contingency tablesIvy Liu Victoria University

Non-linear mixed-effects modelling for a soil temperature studyPauline Ding ANU

1220 Lunch (1 hour 10 minutes)
1330 Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Embedded partially replicated designs for grain quality testing. Chair: David Baird

1410

-1510

Session 3 Swifts Design

Chair Ross Darnell

Session 3 Boardroom Functional AnalysisChair Marcus Davy

1410 Spatial models for plant breeding trialsEmlyn Williams ANU

Can functional data analysis be used to develop a new measure of global cardiac functionKatrina Poppe University of Auckland

1430 A two-phase design for a high-throughput proteomics experimentKevin Chang University of Auckland

Variable penalty dynamic warping for aligning GC-MS dataDavid Clifford CSIRO

1450 Shrinking sea-urchins in a high CO2 world a two-phase experimental designKathy Ruggiero University of Auckland

A model for the enzymatically 18O-labeled MALDI-TOF mass spectraTomasz Burzykowski Hasslet University

1510 Afternoon Tea (30 minutes)


WEDNESDAY 2ND DEC1540

-1700

Session 4 Swifts Methods

Chair David Clifford

Session 4 Boardroom Mixtures amp Classification

Chair Thomas Yee1540 High-dimensional multiple

hypothesis testing with dependenceSandy Clarke University of Melbourne

On estimation of nonsingular normal mixture densitiesMichael Stewart University of Sydney

1600 Metropolis-Hastings algorithms with adaptive proposalsRenate Meyer University of Auckland

Estimation of finite mixtures with nonparametric componentsChew-Seng Chee University of Auckland

1620 Bayesian inference for multinomial probabilities with non-unique cell classification and sparse dataNokuthaba Sibanda Victoria University

Classification techniques for class imbalance dataSiva Ganesh Massey University

1640 Filtering in high dimension dynamic systems using copulasJonathon Briggs University of Auckland

Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the meanSelvanayagam Ganesalingam Massey University

1800 Conference Dinner


THURSDAY 3RD DEC
900 Keynote Address (Swifts): Chris Triggs, University of Auckland. Nutrigenomics - a source of new statistical challenges. Chair: Ruth Butler

950

-1030

Session 1 Swifts Genetics

Chair Ken Dodds

Session 1 Boardroom Ecology

Chair Duncan Hedderley950 Combination of clinical and

genetic markers to improve cancer prognosisKim-Anh Le Cao University of Queensland

A multivariate feast among bandicoots at Heirisson ProngTeresa Neeman ANU

1010 Effective population size estimation using linkage disequilibrium and diffusion approximationJing Liu University of Auckland

Environmental impact assessments a statistical encounterDave Saville Saville Statistical Consulting Ltd

1030 Morning Tea (30 minutes)
1100 Invited Speaker (Swifts): Kaye Basford, University of Queensland. Ordination of marker-trait association profiles from long-term international wheat trials. Chair: Lyn Hunt

1140

-1220

Session 2 Swifts Medical

Chair Ken Beath

Session 2 Boardroom Genetics

Chair Julian Taylor1140 Finding best linear

combination of markers for a medical diagnostic with restricted false positive rateYuan-chin Chang Academia Sinica

Believing in magic validation of a novel experimental breeding designEmma Huang CSIRO Mathematics Informatics and Statistics

1200 A modified combination test for the analysis of clinical trialsMarkus Neuhaumluser Rhein Ahr Campus

Phenotypes for training and validation of whole genome selection methodsKen Dodds AgResearch

1220 Closing Remarks
1230 Lunch
1300 Conference Concludes


ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts): Louise Ryan, CSIRO Mathematics Informatics and Statistics. Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise Ryan
CSIRO Mathematics Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.


950 - 1030

MONDAY 30TH NOV, Session 1 (Swifts): Medical. Chair: John Field

SUB-SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING A COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal cancer (CRC) is one of the most malignant cancers throughout the world, and it varies because of the different effects of risk factors in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluation of the risk factors of the cancer as a whole would not provide a thorough understanding of the cancer. Therefore the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis according to the pathology report of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007 were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis utilizing Stata statistical software. The results confirm gender, alcohol history, IBD and tumor grade as specific risk factors of colon cancer, and hypertension, opium and personal history as specific risk factors of rectum cancer. Also, BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity; colon and rectum cancers should be evaluated specifically to reveal hidden associations which may not be revealed under general modeling. These findings could provide more information for prognosis and treatment therapy, and possible application of screening programs specifically for colon and rectum carcinomas.


PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia, Mathematics Informatics and Statistics, Glen Osmond, South Australia
2Department of Surgery, University of Adelaide, the Queen Elizabeth Hospital, Adelaide, South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001, and whose outcomes were followed for more than five years.

The ERA Model is available at the following website (wwwhealthadelaideeduausurgeryevar). The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA Model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using data from a specialist UK vascular institute. Despite UK patients being sicker (p<0.001), having larger aneurysms (p<0.001) and being more likely to die (p<0.05) than the Australian patients, the ERA model fitted the UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher areas under ROC curves and/or higher R2.

The ERA Model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008), Eur J Vasc Endovasc Surg 35: 571-579.
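
As an illustration only (simulated data and invented coefficients, not the ERA model or its audit data), the general shape of such a risk model is a binomial logit fit followed by an area-under-ROC summary:

## Simulated sketch of a preoperative risk model (not the ERA model)
set.seed(1)
n    <- 500
age  <- rnorm(n, 74, 6)                  # age at operation
diam <- rnorm(n, 58, 10)                 # aneurysm diameter (mm)
dead <- rbinom(n, 1, plogis(-12 + 0.10 * age + 0.05 * diam))

fit  <- glm(dead ~ age + diam, family = binomial(link = "logit"))
phat <- fitted(fit)

## Area under the ROC curve via the rank (Mann-Whitney) formula
n1  <- sum(dead); n0 <- sum(dead == 0)
auc <- (sum(rank(phat)[dead == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
auc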


950 - 1030

MONDAY 30TH NOV, Session 1 (Boardroom): Ecological Modelling. Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics Informatics and Statistics2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that an annual multi-species fishery-independent survey be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region as well as for the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
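
The study itself used BayesX; purely as a rough non-Bayesian analogue in R (simulated catches at made-up coordinates, not the NPF survey data), a penalised spatial spline can be fitted with mgcv:

## Simulated stand-in for trawl catches over space
library(mgcv)
set.seed(1)
n    <- 300
lon  <- runif(n, 136, 141)
lat  <- runif(n, -17, -12)
dens <- rpois(n, lambda = exp(1 + sin(lon - 138) + cos(lat + 14)))

## Penalised thin-plate spline over space, Poisson response
fit <- gam(dens ~ s(lon, lat, k = 50), family = poisson(link = "log"))
summary(fit)
## predict(fit, newdata = <grid of locations>, type = "response") would
## give a smoothed density surface analogous to the survey maps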


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics Informatics and Statistics Australia2School of Mathematics and Statistics Northeast Normal University China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust and has been found to be more natural when substantial proportions of the observations are below detection limits (censored), and more efficient when errors have heavy-tailed distributions. To alleviate the computational burden we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.


1100 -1220

MONDAY 30TH NOV, Session 2 (Swifts): Modelling. Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird1
1VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize the sum over observations of e(Q - I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
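
A small worked example of the idea, using the quantreg package (one of the R implementations of the methods mentioned above; the data are simulated):

## Simulated data whose spread grows with x
library(quantreg)
set.seed(1)
x <- runif(200, 0, 10)
y <- 2 + 0.5 * x + rnorm(200, sd = 0.5 + 0.2 * x)

rq(y ~ x, tau = 0.5)                 # median regression
rq(y ~ x, tau = c(0.1, 0.5, 0.9))    # a set of quantile curves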


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS, University of Newcastle
2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed to be distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.
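
The hierarchical model described above was fitted in WinBUGS. Purely to illustrate the quantities being pooled (and not the authors' Bayesian model), a quick frequentist random-effects combination of some hypothetical study log risk ratios can be written in R as:

## Invented study estimates on the log risk-ratio scale
logrr <- log(c(2.4, 1.8, 2.9, 1.6, 2.2, 2.5))
se    <- c(0.35, 0.40, 0.50, 0.45, 0.30, 0.55)

## DerSimonian-Laird random-effects pooling
w    <- 1 / se^2
ybar <- sum(w * logrr) / sum(w)
Q    <- sum(w * (logrr - ybar)^2)
tau2 <- max(0, (Q - (length(logrr) - 1)) / (sum(w) - sum(w^2) / sum(w)))
ws   <- 1 / (se^2 + tau2)
pooled <- sum(ws * logrr) / sum(ws)
exp(pooled + c(-1.96, 0, 1.96) * sqrt(1 / sum(ws)))  # lower limit, pooled RR, upper limit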


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie1
1Department of Statistics and School of Geography, Geology and Environment,

University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are a result of a possible failure of the uniformitarianism principle. This is because the responses of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and hence this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring width indices for each core was divided into two subsets:

1. The portion of the series produced when the trees were small, and

2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham1
1Analytics - SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers some users a way to experiment with new cutting-edge methods; others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together, and plans for future integration.

1100 - 1220

MONDAY 30TH NOV, Session 2 (Boardroom): Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ
2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit ANU2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals but as yet has not been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaConTM (Vac1), or a single vaccination of GonaConTM followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions at irregular intervals over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or the boosted vaccination.

The data are analysed using repeated measures methods to assess the long-term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.
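
As a rough R sketch of one such option (not necessarily the analysis used here; the data frame wallaby and its column names are hypothetical), a linear mixed model with a continuous-time AR(1) correlation structure copes directly with the irregular measurement times:

    library(nlme)

    fit <- lme(weight ~ treatment * weeks,                      # growth by treatment over time
               random      = ~ 1 | animal,                      # animal-specific intercepts
               correlation = corCAR1(form = ~ weeks | animal),  # handles unequal spacing
               data        = wallaby)
    summary(fit)
    anova(fit)   # compare the Control, Vac1 and Vac2 profiles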

MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Wealth from Oceans Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients, from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.
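
A hedged sketch of the general idea in R, using the flexmix package rather than the authors' own implementation (the data frame reef, with one row per species-by-site record, and its columns are hypothetical):

    library(flexmix)

    ## mixture of logistic GLMs; the grouping term keeps each species within one archetype
    fit <- flexmix(presence ~ temperature + oxygen + salinity | species,
                   data  = reef,
                   k     = 15,
                   model = FLXMRglm(family = "binomial"))
    parameters(fit)       # archetype-specific regression coefficients
    table(clusters(fit))  # allocation of records to archetypes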

THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark1
1University of Otago

E-mail aclarkmathsotagoacnz

When the Chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the Chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S. W. (1958, JEBS, 69-82) and Huynh, H. & Feldt, L. S. (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.
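
One ingredient of the cited Geisser-Greenhouse work is an epsilon that measures departure from independence/sphericity and can be used to shrink degrees of freedom; a purely illustrative R sketch of that quantity follows (the speakers' actual statistic and df derivation may differ, and X, X2 and df_nominal are hypothetical inputs):

    ## X: patients x symptoms matrix of 0/1 indicators (hypothetical)
    k  <- ncol(X)
    S  <- cov(X)
    J  <- diag(k) - 1/k
    Sc <- J %*% S %*% J                                  # double-centred covariance
    eps <- sum(diag(Sc))^2 / ((k - 1) * sum(Sc^2))       # Box / Geisser-Greenhouse epsilon

    ## shrink the nominal df of a chi-square statistic X2 (both hypothetical here)
    pchisq(X2, df = eps * df_nominal, lower.tail = FALSE)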

1330 MONDAY 30TH NOV Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ
2University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R, and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R; we'll try to show the nature of these differences.
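
A small base-R example of the kind of efficiency issue in question: growing an object inside a loop versus a single vectorised computation (timings are indicative only):

    x <- rnorm(1e5)

    slow_square <- function(x) {
      out <- numeric(0)
      for (i in seq_along(x)) out <- c(out, x[i]^2)   # re-allocates the vector every iteration
      out
    }

    fast_square <- function(x) x^2                    # vectorised, a single allocation

    system.time(slow_square(x))
    system.time(fast_square(x))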

1410 - 1510

MONDAY 30TH NOV Session 3 Swifts: Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1
1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, to approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modelling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.

VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample sizes for balanced designs are also provided.

Results: Application of the methods was illustrated in examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.
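
A minimal R sketch of the kind of variance components fit described, assuming a long-format data frame gait with repeated scores by subject, assessor and session (all names hypothetical):

    library(lme4)

    fit <- lmer(score ~ 1 + (1 | subject) +
                            (1 | subject:assessor) +
                            (1 | subject:session),
                data = gait, REML = TRUE)
    VarCorr(fit)   # within-subject, within-assessor and within-session components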

MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlEx.

Using fungus microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages, in which ANOVA and REML are standard methods, may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
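
The point in miniature, as a hedged R sketch (the response y, standing for the molecular distance quantity analysed, and the data frame fungus are hypothetical):

    library(lme4)

    ## REML variance components: populations within regions, individuals within populations
    fit <- lmer(y ~ 1 + (1 | region) + (1 | region:population), data = fungus)
    VarCorr(fit)

    ## the same hierarchical layout as a classical nested ANOVA table
    summary(aov(y ~ region/population, data = fungus))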

1410 - 1510

MONDAY 30TH NOV Session 3 Boardroom: Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ-funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular, we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.

HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla12

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help to understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation such as spatial trends and extraneous environmental variation need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high-dimensional genetic component becomes problematic. This talk discusses the incorporation of high-dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way even when the number of genetic variables exceeds the number of observations. This method is then applied to wheat quality traits and a well-established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.

CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, and associated phenotypic data (milk yield; protein, casein and total solids percentage and yield; and growth hormone, IGF and insulin levels). These data are highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data were analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed-effects regression with sire as a random term. To ensure only quality gene expression data were used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.
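
A hedged R sketch of the two analysis steps described above (the objects expr, a genes-by-cows expression matrix, and pheno are hypothetical):

    library(lme4)

    ## step 1: correlation screen of every gene against one phenotype, e.g. milk yield
    pvals <- apply(expr, 1, function(g) cor.test(g, pheno$milk_yield)$p.value)
    hits  <- names(pvals)[p.adjust(pvals, method = "BH") < 0.05]

    ## step 2: mixed-effects follow-up for a selected gene, with sire as a random term
    d   <- data.frame(y = pheno$milk_yield, x = expr[hits[1], ], sire = pheno$sire)
    fit <- lmer(y ~ x + (1 | sire), data = d)
    summary(fit)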

1540 - 1700

MONDAY 30TH NOV Session 4 Swifts: Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined. Some are highly statistical, some are based much more on clinical judgement; some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum, we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment, nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin, to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, the convenience of using the medication, and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.

DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample, these texture analysis and rheological methods generated many data points which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly.
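
A minimal R sketch of the kind of curve summary involved: locating the main peak and the steepest point of a sampled texture curve (the time and force vectors are hypothetical):

    peak_i  <- which.max(force)                    # main peak
    slope   <- diff(force) / diff(time)            # finite-difference slope
    slope_i <- which.max(abs(slope))               # point of maximum slope

    data.frame(peak_time  = time[peak_i],  peak_value      = force[peak_i],
               slope_time = time[slope_i], max_slope_value = slope[slope_i])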

INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful, hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently, an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern, as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.

STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurements of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks age of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to those of previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that the growth of different populations differs. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length
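
A hedged R sketch of the fitting described, for one measurement (the published method may differ in detail; the data frame scans and its columns are hypothetical):

    fit   <- lm(BPD ~ poly(ga, 2), data = scans)      # quadratic in gestational age (weeks)
    grid  <- data.frame(ga = 20:42)
    mu    <- predict(fit, newdata = grid)
    sigma <- summary(fit)$sigma                       # residual SD

    centiles <- data.frame(ga  = grid$ga,
                           p10 = mu + qnorm(0.10) * sigma,
                           p50 = mu,
                           p90 = mu + qnorm(0.90) * sigma)
    matplot(centiles$ga, centiles[, -1], type = "l", lty = 1,
            xlab = "Gestational age (weeks)", ylab = "BPD")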

1540 - 1700

MONDAY 30TH NOV Session 4 Boardroom: Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrumbungle National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled as a function of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood-based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.
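
The fits that such diagrams summarise can be produced with standard R tools; a small sketch with hypothetical predictor names:

    m1 <- glm(present ~ food,            family = binomial, data = plots)
    m2 <- glm(present ~ predator,        family = binomial, data = plots)
    m3 <- glm(present ~ food + predator, family = binomial, data = plots)

    data.frame(model = c("food", "predator", "food + predator"),
               AIC   = c(AIC(m1), AIC(m2), AIC(m3)),
               BIC   = c(BIC(m1), BIC(m2), BIC(m3)))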

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains, such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general, the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.

A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Mathematics, Informatics and Statistics, Australia
2Queensland Department of Environment and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, how best to handle the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second order model, central composite design
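
The backfitting idea in miniature, as an illustrative R sketch for a two-term additive fit (not the authors' response-surface implementation):

    backfit <- function(y, x1, x2, n_iter = 20) {
      alpha <- mean(y)
      f1 <- f2 <- rep(0, length(y))
      for (i in seq_len(n_iter)) {
        f1 <- predict(smooth.spline(x1, y - alpha - f2), x1)$y   # update f1 holding f2 fixed
        f1 <- f1 - mean(f1)
        f2 <- predict(smooth.spline(x2, y - alpha - f1), x2)$y   # update f2 holding f1 fixed
        f2 <- f2 - mean(f2)
      }
      list(alpha = alpha, f1 = f1, f2 = f2)
    }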

TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland1
1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.
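
One simple way of acknowledging the hidden sample of operators is a random intercept for the treatment provider; a hedged R sketch, not necessarily the method suggested in the talk, with hypothetical variable names:

    library(lme4)

    fit <- lmer(outcome ~ treatment + (1 | therapist), data = trial)
    summary(fit)   # treatment effect with between-therapist variance acknowledged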

950 - 1030

TUESDAY 1ST DEC Session 1 Swifts: Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were more recently developed in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above-mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision, it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as other carried-forward methods, distorts the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we propose No Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate that the missing data mechanism is MNAR, we will briefly show that this is actually consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.

APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has commonly been used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in the identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

950 - 1030

TUESDAY 1ST DEC Session 1 Boardroom: Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

We will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
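
A sketch of the kind of model sequence involved, using standard R packages (the data frame sites and its columns are hypothetical):

    library(MASS)   # glm.nb
    library(pscl)   # zeroinfl

    m_pois <- glm(count ~ habitat, family = poisson, data = sites)
    sum(residuals(m_pois, type = "pearson")^2) / df.residual(m_pois)   # >> 1 suggests overdispersion

    m_nb   <- glm.nb(count ~ habitat, data = sites)                              # overdispersion only
    m_zip  <- zeroinfl(count ~ habitat | habitat, data = sites)                  # extra zeros, Poisson counts
    m_zinb <- zeroinfl(count ~ habitat | habitat, dist = "negbin", data = sites) # both

    AIC(m_pois, m_nb, m_zip, m_zinb)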

A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well-known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable, due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients to be non-negative, rather than the fitted means. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.

1130 - 1230

TUESDAY 1ST DEC Session 2 Swifts: Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.

ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1 Andrea Rotnitzky23 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T. di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV-infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models, and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV-positive patients to illustrate estimation of the optimal CD4 count level at which to start HAART.

PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk for those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to detect real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.

1130 - 1230

TUESDAY 1ST DEC Session 2 Boardroom: Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics and Centre for Mathematical Biology, University of Sydney, F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney, F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities, and is in that sense superior to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing parameter estimates after imputation similar to those found when the data were fully observed. It will be shown that the amount of missingness present in the data set, and the nature of the variable in question, affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.
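
One ingredient of such a strategy, multiple imputation followed by pooled logistic regression, as a hedged R sketch (variable and data frame names are hypothetical):

    library(mice)

    imp  <- mice(pregnancy, m = 20, seed = 1)                          # 20 imputed data sets
    fits <- with(imp, glm(outcome ~ age + hcg + bleeding, family = binomial))
    summary(pool(fits))                                                # Rubin's rules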

STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists use techniques that produce high-dimensional data to unveil hidden information. Although several model-based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple, as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.
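
The main idea, as an illustrative R sketch only (the authors' package will differ): repeatedly run single-term ANOVAs, remove the variation explained by the best factor, and continue on the residuals:

    pare_down <- function(y, factors, n_steps = 5) {
      picked <- character(0)
      r <- y
      for (s in seq_len(n_steps)) {
        pvals  <- sapply(factors, function(f) anova(lm(r ~ f))[["Pr(>F)"]][1])
        best   <- names(which.min(pvals))
        picked <- c(picked, best)
        r <- resid(lm(r ~ factors[[best]]))    # pare down the explained variation
      }
      picked
    }

    ## usage: pare_down(response, factors = list(A = A, B = B, C = C))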

EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e. the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1/p2 and r- = (1 - p1)/(1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio r(x) = f1(x)/f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating r(x) and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.

WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample for a "validation study" or a sample stratified on a health outcome for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.

950 - 1030

WEDNESDAY 2ND DEC Session 1 Swifts: Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al. (2002, Biometrical J 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al., 1997, Ann Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al. for their simulations.

CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, US National Cancer Institute
2Texas A&M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error in FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.

950 - 1030

WEDNESDAY 2ND DEC Session 1 Boardroom: Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model is proposed, reflecting the sampling process commonly used in marine surveys, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef lagoon and highlights that the widely used method called sub-sampling has a non-ignorable influence on presence/absence measures of species.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua region, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller-sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.

1100 - 1220

WEDNESDAY 2ND DEC Session 2 Swifts: Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
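
Two of the compared approaches, as a small R sketch with hypothetical variable names: log-binomial regression, and the log-Poisson model with robust standard errors of Zou (2004):

    library(sandwich)
    library(lmtest)

    ## log-binomial: estimates the relative risk directly, but may fail to converge
    m_lb <- glm(event ~ treat + baseline, family = binomial(link = "log"), data = rct)

    ## modified Poisson: same log link, with a robust (sandwich) variance
    m_mp <- glm(event ~ treat + baseline, family = poisson(link = "log"), data = rct)
    coeftest(m_mp, vcov. = vcovHC(m_mp, type = "HC0"))   # exp(coef) gives the adjusted RR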

MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail espark02gmailcom

The idea of a response-adaptive design in clinical trials is to allocate more subjects to the superior treatment during a trial without diminishing its statistical significance and efficiency too much. In addition, innovations in genomics-related biomedical research have made personalized medicine possible, which also makes adjustment for the covariates of subjects who join the trial an important issue in a clinical trial.

Adaptive design is a longstanding statistical method for situations where the design for a statistical model involves unknown parameters that must be estimated during the course of an experiment. Thus the concept of sequential analysis is naturally involved. The large sample properties of estimation under such a problem have been studied and can be found in the literature, for example Zhang et al. (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires both the estimation and design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure, the multiple-stage method, which requires the estimation and design to be updated only at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it retains the advantages of the fully sequential method to some degree and is more convenient in practical operation. Here we study a three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesized data is also presented.

Traditionally, a response-adaptive (RA) design is used under the assumption that there is no treatment-covariate interaction effect, i.e. that the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the RA design will make incorrect treatment allocations; that is, it can be correct in one part of the population but completely wrong in the other. Thus, in this case, the CARA design should perform better than the RA design.

In this work we also compare sequential analysis in response-adaptive designs with and without covariate adjustment, and a numerical study of synthesized data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail patrickgrahamotagoacnz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer-performing hospitals could be improved by adopting the treatment practices of better-performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple-category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al. (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes, and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30-day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.

LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail iliumsorvuwacnz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control for other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, many surveys have a situation where respondents may select more than one outcome category, so the observations can fall in more than one category in the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.
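
As a point of reference for the quantity being estimated, a minimal R sketch of the local odds ratios for an ordinary (mutually exclusive) two-way table follows; the 3 x 3 table of counts is hypothetical, and the multiple-response and Mantel-Haenszel extensions discussed in the talk are not shown.

# Local odds ratios for an I x J table: theta[i,j] = n[i,j]*n[i+1,j+1] / (n[i,j+1]*n[i+1,j])
local_odds_ratios <- function(n) {
  I <- nrow(n); J <- ncol(n)
  theta <- matrix(NA_real_, I - 1, J - 1)
  for (i in seq_len(I - 1)) {
    for (j in seq_len(J - 1)) {
      theta[i, j] <- (n[i, j] * n[i + 1, j + 1]) / (n[i, j + 1] * n[i + 1, j])
    }
  }
  theta
}

# Hypothetical 3 x 3 table of counts
tab <- matrix(c(25, 10, 5,
                12, 20, 8,
                 4, 15, 30), nrow = 3, byrow = TRUE)
local_odds_ratios(tab)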


11:00 - 12:20

WEDNESDAY 2ND DEC, Session 2, Boardroom: Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail rogerlittlejohnagresearchconz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail mdantuonoagricwagovau

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the seeming lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1 Murray Judd2 John Meekings3 Annette Richardson3 and Eric Walton4

1Swinburne University of Technology
2Seeka Kiwifruit Industries
3The New Zealand Institute for Plant and Food Research Ltd
4University of Otago

E-mail dmeyerswineduau

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail Paulinedinganueduau

There is growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were ground cover type (covered, uncovered), distance from the log (0 cm, 10 cm, 20 cm, 40 cm, 80 cm) and depth (1 cm, 5 cm). Two non-linear mixed models were used to study the different treatment effects.

13:30 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1

1Wagga Wagga Agricultural Institute, Australia
2Rothamsted Research, Harpenden, UK

E-mail alisonsmithindustrynswgovau

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples, which precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.


14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3, Swifts: Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second-phase laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat-shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
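
The abstract notes that the data can be analysed with the LIMMA package in R. A minimal sketch of such an analysis for a two-colour design is given below; the targets frame, treatment labels and reference level are assumptions for illustration, and the normalised MAList (MA) would come from the usual read.maimages()/normalizeWithinArrays() steps.

library(limma)

# Hypothetical targets frame: one row per array; Cy3/Cy5 give the sample
# (seawater acidity x heat-shock combination) hybridised to each channel.
targets <- data.frame(Cy3 = c("control", "acid", "control", "heat"),
                      Cy5 = c("acid",    "heat", "heat",    "control"))

design <- modelMatrix(targets, ref = "control")   # contrasts relative to control

# 'MA' would be the normalised log-ratios (an MAList):
# fit <- lmFit(MA, design)
# fit <- eBayes(fit)                 # moderated t-statistics
# topTable(fit, coef = "acid")       # genes responding to increased acidity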

14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3, Boardroom: Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1 Gillian Whalley1 Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately from relaxation.

Three-dimensional ultrasound images of the heart allow the volume of the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against the first and second derivatives evolves a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation, and so develop a new measure of global cardiac function.
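
A minimal R sketch of the general idea, using a smoothing spline rather than a full functional data analysis: smooth the frame-by-frame volumes, take first and second derivatives, and examine the resulting loop. The simulated volume curve is an assumption, not clinical data.

set.seed(1)
tt  <- seq(0, 1, length.out = 60)                   # one cardiac cycle (s)
vol <- 85 + 35 * cos(2 * pi * tt) + rnorm(60, 0, 1) # simulated LV volume (ml)

fit <- smooth.spline(tt, vol)
v   <- predict(fit, tt)$y             # smoothed volume V(t)
dv  <- predict(fit, tt, deriv = 1)$y  # dV/dt
d2v <- predict(fit, tt, deriv = 2)$y  # d2V/dt2

# (V, dV/dt, d2V/dt2) traces a closed loop in three dimensions; projecting the
# loop to maximise its enclosed area gives the proposed summary measure.
pairs(cbind(volume = v, dVdt = dv, d2Vdt2 = d2v))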


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology divides the substance into, and quantifies the amount of, each compound that makes up the test substance. Typically the first step in an analysis of data like this is the alignment of the data to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances - e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves. In the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al. (Anal Chem 2009, 81 (3), pp 1000-1007)
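
A rough sketch of the penalised dynamic-programming recursion described above, with a constant (rather than variable) penalty for simplicity; it is not the authors' implementation, and the two chromatogram traces are simulated.

# Dynamic time warping where non-diagonal (expansion/contraction) steps incur
# an extra additive penalty, discouraging over-warping of chromatogram signals.
dtw_penalised <- function(x, y, penalty = 0) {
  n <- length(x); m <- length(y)
  D <- matrix(Inf, n + 1, m + 1)
  D[1, 1] <- 0
  for (i in 1:n) {
    for (j in 1:m) {
      d <- abs(x[i] - y[j])
      D[i + 1, j + 1] <- min(D[i, j] + d,                 # diagonal step
                             D[i, j + 1] + d + penalty,   # vertical step (penalised)
                             D[i + 1, j] + d + penalty)   # horizontal step (penalised)
    }
  }
  D[n + 1, m + 1]
}

# Two hypothetical chromatogram traces, the second shifted in time
x <- dnorm(1:200, mean = 100, sd = 8)
y <- dnorm(1:200, mean = 108, sd = 8)
dtw_penalised(x, y, penalty = 0)      # unpenalised DTW cost
dtw_penalised(x, y, penalty = 0.01)   # penalty discourages non-diagonal moves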


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1 Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, e.g., two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may receive various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al. (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation of the model. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4, Swifts: Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1
1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians grows from a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although it is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances where this is not the case, and these will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the FWER or FDR should enable us to improve the power of our procedures to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.
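
To illustrate the setting, a small simulation in the spirit of the talk: test statistics generated from a linear (autoregressive) process, so they are dependent, with the Benjamini-Hochberg FDR adjustment applied as if they were independent. The effect sizes and AR coefficient are arbitrary choices.

set.seed(1)
m <- 5000
z <- as.numeric(arima.sim(list(ar = 0.6), n = m)) * sqrt(1 - 0.6^2)  # AR(1) noise, unit variance
effect <- c(rep(3, 250), rep(0, m - 250))     # 250 true signals among 5000 tests
p <- 2 * pnorm(-abs(effect + z))              # two-sided p-values
padj <- p.adjust(p, method = "BH")            # Benjamini-Hochberg, assuming independence
table(discovery = padj < 0.05, truth = effect != 0)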


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand
2University of South Carolina, USA
3University of Montreal, Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm for sampling from non-logconcave univariate densities. Using various examples, we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1
1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented-data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) are used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.
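
A generic sketch of the augmented-data idea for a multinomial with non-uniquely classified observations, using a Dirichlet prior and a simple Gibbs sampler; the three-cell setup and counts are hypothetical, and the sketch is not the authors' model for trisomy 21.

set.seed(42)
n_exact <- c(20, 5, 3)   # counts uniquely classified to cells 1, 2, 3
n_ambig <- 12            # counts known only to fall in cell 1 or cell 2
prior   <- c(1, 1, 1)    # Dirichlet(1, 1, 1) prior on the cell probabilities

rdirichlet1 <- function(a) { g <- rgamma(length(a), shape = a); g / sum(g) }

p <- rep(1 / 3, 3)
draws <- matrix(NA, 2000, 3)
for (s in 1:2000) {
  # augment: allocate the ambiguous counts between cells 1 and 2
  k1 <- rbinom(1, n_ambig, p[1] / (p[1] + p[2]))
  n_full <- n_exact + c(k1, n_ambig - k1, 0)
  # update the cell probabilities from the completed counts
  p <- rdirichlet1(prior + n_full)
  draws[s, ] <- p
}
colMeans(draws[-(1:500), ])   # posterior means after burn-in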

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1
1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4, Boardroom: Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1
1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show that the new approach performs better through simulation studies and some real-world biological data sets.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in statistics and data mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally sized, or balanced, and the classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class or group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). Traditional classification techniques perform badly when they learn from imbalanced training sets. Thus, classification on imbalanced data has become an important research problem, with the main interest being on building models to correctly classify the minority class.

In this presentation a brief overview of the approaches found in the literature is given, followed by details of some proposed alternatives. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples, and the findings are discussed.
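
A minimal sketch of the first (sampling-based) approach mentioned above: under-sampling the majority class before fitting a simple logistic classifier, on simulated two-class data. It illustrates the rebalancing idea only, not the alternatives proposed in the talk.

set.seed(7)
n_maj <- 2000; n_min <- 60
x <- rbind(matrix(rnorm(2 * n_maj), ncol = 2),
           matrix(rnorm(2 * n_min, mean = 1.5), ncol = 2))
y <- factor(c(rep("majority", n_maj), rep("minority", n_min)))
dat <- data.frame(x1 = x[, 1], x2 = x[, 2], y = y)

# Under-sample the majority class to match the minority class size
keep <- c(sample(which(y == "majority"), n_min), which(y == "minority"))
fit_bal <- glm(y ~ x1 + x2, family = binomial, data = dat[keep, ])
fit_raw <- glm(y ~ x1 + x2, family = binomial, data = dat)

# Compare minority-class sensitivity of the two rules at a 0.5 threshold
pred_bal <- predict(fit_bal, dat, type = "response") > 0.5
pred_raw <- predict(fit_raw, dat, type = "response") > 0.5
c(balanced = mean(pred_bal[y == "minority"]), raw = mean(pred_raw[y == "minority"]))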


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of error rates is of vital importance in classification problems, as this is used as a basis to choose the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally, a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and the QDF are derived and computed for various covariance structures in a simulation exercise, which serve as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. This approximation also provides a closed-form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data


THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS – A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1
1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory bowel diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high-throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9:50 - 10:30

THURSDAY 3RD DEC, Session 1, Swifts: Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao12 Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques, Nutrition et Diabetes, Université de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail klecaouqeduau

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap. 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1
1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift is used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.

9:50 - 10:30

THURSDAY 3RD DEC, Session 1, Boardroom: Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS A STATISTICAL ENCOUNTER

Dave Saville1
1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought on how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting.


11:00 THURSDAY 3RD DEC, Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1, IH Delacy1,2, J Crossa3, PM Kroonenberg4, MJ Dieters1 and KE Basford1,2

1The University of Queensland, Australia
2Australian Centre for Plant Functional Genomics, Australia
3CIMMYT, Mexico
4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al. 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes x TAMs x traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address the non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype x TAM block x trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM block and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


11:40 - 12:20

THURSDAY 3RD DEC, Session 2, Swifts: Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1
1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC, respectively). In some medical diagnostics it is necessary to confine the false positive rate within a specific range, which makes the pAUC a reasonable choice under such circumstances. Thus we emphasize the pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

l_p = (w_D S_D + w_D̄ S_D̄)^{-1} (m_D − m_D̄),

where m_D, S_D and m_D̄, S_D̄ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients w_D, w_D̄ ∈ R^1 depend on the given specificity and are also functions of l_p. Thus the solution for l_p requires an iterative procedure. We apply it to the data set of Liu et al. (2005, Stat in Med), and the numerical results show that our method outperforms that of Liu et al. (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al. (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes the pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large-sample properties of this method are derived. We then apply it to some real data sets, and the results are very promising, locating markers that are never found via AUC-based methods.


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1
1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser, 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore, an asymmetric decision rule as proposed by Bauer & Köhne (1994) for adaptive designs is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus, the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1·p2 ≤ c_α. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994); for example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in the case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that, in realistic scenarios, the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
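
A sketch of the decision rule as stated above, with α0, α1 and c_α supplied by the user; the constants in the example call are placeholders (α1 from the worked example above, and Fisher's 5% product bound for c_α), not a jointly calibrated choice.

# Decision rule stated above: significant if max(p1, p2) <= a1, or if
# max(p1, p2) <= a0 and p1 * p2 <= c_alpha.
modified_combination_test <- function(p1, p2, a0, a1, c_alpha) {
  (max(p1, p2) <= a1) || (max(p1, p2) <= a0 && p1 * p2 <= c_alpha)
}

# Placeholder constants only: c_alpha here is Fisher's 5% product bound
# exp(-qchisq(0.95, 4)/2), not the constant recalibrated for the modified test.
c_alpha <- exp(-qchisq(0.95, df = 4) / 2)
modified_combination_test(p1 = 0.04, p2 = 0.12, a0 = 0.5, a1 = 0.1793, c_alpha = c_alpha)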


11:40 - 12:20

THURSDAY 3RD DEC, Session 2, Boardroom: Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Food Futures National Research Flagship
3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represents phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, due to multiple founders and intermediate generations being unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC population in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1, Benoit Auvray1, Peter Amer2, Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch Invermay Agricultural Centre, Mosgiel, New Zealand
2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will almost be true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets, and appropriate phenotypes, for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN

Ruth Butler1
1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics, because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p-values are often interpreted in a classical analysis as giving 1 minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p-value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews1). In this poster Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1 Matthews (2001, J Stat Plan Inf 94, 43-58)


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1, Tony Swain2, Olena Kravchuk1 and Geoffry Fordyce2

1School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia
2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A six-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<57) and outliers. Variances increased with P4 level, so the P4 values were log-transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms, and unequal variances for the repeated measures, with the heifer-by-day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al., Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The interaction vaccine x weaner size x time was only significant in 1992. The interaction vaccine x time was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccination, but the vaccine effect diminished as the heifers aged. The interaction nutrition x weaner size x time was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study, with its complicated unbalanced repeated measures design.


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1
1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable: omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the tree, which is then used to predict the other half of the data. By examining the predictive ability of the several thousand trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
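
A minimal sketch of a BRT fit with the gbm package, ranking covariates by relative influence; the data frame and model settings are invented placeholders, not the kiwifruit-orchard data.

library(gbm)
set.seed(1)
# Hypothetical stand-in for the orchard data: counts of one cicada species
# plus a few landscape covariates, one row per monitored location.
cicada <- data.frame(count    = rpois(200, lambda = 3),
                     shelter  = runif(200),
                     grass    = runif(200),
                     altitude = rnorm(200, 300, 50))

fit <- gbm(count ~ ., data = cicada, distribution = "poisson",
           n.trees = 3000, interaction.depth = 3, shrinkage = 0.01,
           bag.fraction = 0.5, train.fraction = 0.5)  # half fit, half held out

best <- gbm.perf(fit, method = "test")   # number of trees minimising held-out error
summary(fit, n.trees = best)             # relative influence of each covariate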


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep, and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations, and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land, Crop and Food Sciences, University of Queensland
2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling, in an experiment investigating the digestibility of sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether, from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions were obviously, and in a complex way, changing with changes in the milling energy. The average volumetric diameter alone was not an adequate summary of the distributions. It was thus necessary to construct a tailored algorithm for summarising the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1, Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne
2Remote Sensing Team, CSIRO Sustainable Ecosystems
3Senior Lecturer, The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as ridge regression, the LASSO and partial least squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ bootstrap and generalized cross-validation (GCV). The techniques are compared using simulations, and are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, where their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
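
A small sketch of the shrinkage methods compared, fitted to simulated collinear data with the glmnet and pls packages; it is illustrative only and does not reproduce the MODIS case study.

library(glmnet); library(pls)
set.seed(1)
n <- 100; p <- 40
z <- rnorm(n)
X <- sapply(1:p, function(j) z + rnorm(n, sd = 0.3))  # strongly collinear predictors
y <- z + rnorm(n)

ridge  <- cv.glmnet(X, y, alpha = 0)                  # ridge regression, lambda by CV
lasso  <- cv.glmnet(X, y, alpha = 1)                  # the LASSO
plsfit <- plsr(y ~ X, ncomp = 10, validation = "CV")  # partial least squares

coef(lasso, s = "lambda.min")   # sparse coefficient vector at the CV-chosen lambda
RMSEP(plsfit)                   # CV prediction error by number of PLS components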


CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1, David Lovell1, Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia
2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison, 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens, if not hundreds, of thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and to explore the extent to which this might be a problem in applications. In particular, we compare the analysis of log-transformed data to a full compositional data analysis.
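
A toy illustration of the contrast raised above, comparing a naive log transform of the proportions with Aitchison's centred log-ratio (clr); the counts are hypothetical.

# Hypothetical counts of five 'species' in three samples; each row is
# constrained by its sequencing depth, so only relative abundances are seen.
counts <- rbind(s1 = c(600, 200, 100, 60, 40),
                s2 = c(650, 150, 100, 60, 40),
                s3 = c(300, 400, 180, 80, 40))
props <- counts / rowSums(counts)

clr <- function(x) log(x) - rowMeans(log(x))   # Aitchison's centred log-ratio
log(props)    # naive log of the proportions: still carries the constraint
clr(props)    # clr-transformed rows sum to zero, removing the unit-sum constraint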


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual tool for validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative estimate, we have employed a minimum-squares-type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data for all but one of the 83 species.
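
A sketch of the two estimation routes described, on simulated weights: a maximum likelihood fit, and a minimum-squares fit of the gamma probabilities to the empirical probabilities on the P-P plot; the shape and rate values are arbitrary.

set.seed(1)
w <- rgamma(300, shape = 1.2, rate = 0.5)        # simulated individual weights

ml <- MASS::fitdistr(w, "gamma")                 # maximum likelihood fit

emp <- (rank(w) - 0.5) / length(w)               # empirical probabilities
pp_ss <- function(lpar) {                        # sum of squares on the P-P plot
  sum((pgamma(w, shape = exp(lpar[1]), rate = exp(lpar[2])) - emp)^2)
}
ls_fit <- optim(c(0, 0), pp_ss)
ls_par <- exp(ls_fit$par)                        # minimum-squares (shape, rate)

plot(emp, pgamma(w, ml$estimate["shape"], ml$estimate["rate"]),
     xlab = "empirical probability", ylab = "fitted probability")
points(emp, pgamma(w, ls_par[1], ls_par[2]), col = 2)
abline(0, 1)                                     # a perfect fit lies on this line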


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1

1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Diabetes complications such as kidney disease cause patients much pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of serum creatinine level changes over time, the lack of longitudinal data and information on this tendency in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that utilised the files of patients with type 2 diabetes who attended the Isfahan Metabolic and Endocrine Research Center from 1997 to 2007. This information was collected longitudinally. We used linear mixed-effects models to analyse the data. Results: Results of the linear mixed-effects model showed that there is a significant association between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information that this study provides can be used to identify groups at high risk of renal dysfunction.

Key Words: Longitudinal study, Mixed effect models, Creatinine, Type 2 diabetes

97The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENT UNDERGONE ISOLATED

CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics Department of2Department of Epidemiology and Preventive Medicine Monash University

Victoria Australia

E-mail Msanagouyahoocom

Purpose This study was conducted to determine risk factors for two postoperative complications new renal failure and stroke (any CVA) for isolated coronary artery bypass graft (CABG) surgery for an Australian population and to develop risk prediction models Background Most studies on postoperative complications for isolated coronary artery bypass graft (CABG) surgeries from one population may not sufficiently reflect clinical characteristics and outcomes when sought to other populations The present study investigates risk factors for postoperative complications in CABG surgeries for an Australian population because there is no model developed in the Australian context Methods The data was collected from July 2001 to June 2008 from 14 public hospitals in Australia 14533 patients underwent isolated CABG during this period The data was divided into two sets model creation (60) and model validation (40) sets The data in the creation set was used to develop the model and then the validation set was used to validate the model Using simple logistic regression risk factors with p-valuelt010 were identified as plausible risk factors and then entered into multiple logistic regression Bootstrapping and stepwise backward elimination methods were used to select significant risk factors The model discrimination and calibration were evaluated using ROC and Hosmer-Lemeshow p-value respectively Results Among 14533 patients underwent CABG over an 8-year period 778 were men and 222 were women The mean (SD) age of the patients was 657 (103) Two postoperative complications are new renal failure with 365 and stroke with 138 The variables identified as risk factors are as follows New renal failure age gender intra-aortic balloon pump previous vascular disease preoperative dialysis ejection fraction estimate CPB time cardiogenic shock ventricular assist device (ROC=070 H-L=lt0001) Stroke cerebrovascular disease ventricular assist device age gender CPB time previous vascular disease preoperative dialysis urgency of procedure (ROC=073 H-L=lt0001) Conclusion We have identified risk factors for two major postoperative complications for CABG surgery

98 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu21CSIRO Wealth from Oceans Flagship and CSIRO Mathematics Informatics

and Statistics2Marine and Coastal Environment Group Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science The approach often taken is to perform some sort regression-type analysis where physical variables are covariates However those physical variables are most commonly not measured at the same locations as the biological data Instead they are usually point predictions from spatial models from auxiliary data sources It is not clear what kind of effects the modelled covariates will have on the model although simple approximations for simple models do give indications We have performed some simulation studies to investigate the manner of the effect and its potential size The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon

99The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence Brno2UCBI Masaryk University Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality at intensive care units The aim of the article is to identify the dependency structure of genes variants which have influence on septic states of children patients

The data set contains the data of 580 pediatric patients of the University Hospital Brno Czech Republic and 641 healthy people 12 genes (CD14 BPI 216 Il-6 176 LBP 098 IL-RA TLR 399 TLR 299 BPI Taq LBP 429 IL 634 HSP70 and TNF beta) were observed The statistically significant differences between healthy group and septic group were observed in variants of genes TLR 399 TLR 299 BPI Taq LBP 429 Il-6 176 and HSP70 The result were published in [123] To identify the role of different combinations of gene variants and to describe the differences in frequencies of gene variants combination between the study groups the genes TLR 299 BPI Taq LBP 429 Il-6 176 and HSP70 were only used The gene TLR 399 has not been used in the analysis because of its high association with the gene TLR299 This way it was possible to create the 5 dimensional contingence table with reasonable high frequencies and to perform the statistical analysis based on hierarchical and graphic al log-linear models The result of the analysis were the hierarchical models of association structure for chosen genes in healthy group and in septic patients group Then the typical combinations of gene variants for healthy group and for septic patients group has been found The result nicely corresponds to the results published in [1 2 3] for individual genes and enables to recognize the typical combination of variants of six genes on which the focus of attention should be concentrated

References [1] Michalek J Svetlikova P Fedora P Klimovic M Klapacova L Bartonova D Hrstkova H Hubacek J A Bactericidal permeability increasing protein gene variants in children with sepsis Intensive Care Medicine ISSN 0342-4642 2007 vol 33 s 2158-2164 [2] Svetlikova P Fornusek M Fedora M Klapacov a L Bartosova D Hrstkova H Klimovic M Novotna E Hubacek JA Michalek J Sepsis Characteristics in Children with Sepsis [in czech abstract in english] In Cesko-Slovenska Pediatrie 59 p 632-636 2004 [3] Michalek J Svetlikova P Fedora P Klimovic M Klapacova L Bartonova D Hrstkova H Hubacek J A J Interleukine - 6 gene variants and the risk of sepsis development in children Human Immunology ELSEVIER SCIENCE INC ISSN 0198-8859 2007 vol 68 pp 756 - 760

100 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD SCALE MARK-CAPTURE

TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd220 Westminster Rd Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application with a tractor-mounted boom sprayer of Bacillus thuringiensis Berliner (Bt) was developed to mark large numbers of potato tuber moth Phthorimaea operculella (Zeller) (Lepidoptera Gelechiidae) and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland New Zealand Moths captured over 3 days were assayed for the Bt marker having established persistence of the marking checked self-marking and cross-contamination rates and confirmed the absence of background (natural) Bt Marking rates of captured moths were 78-100 in the sprayed fields and compared with previous mark-release-recapture studies marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold to gt30 moths per trap Moths were caught up to 750 m from sprayed fields This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance with a common curvature parameter to be fitted Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted and the average distance travelled in 3 days was calculated for each model The distance c within which a given proportion of the moths remained was also estimated viz for linear dispersal

( ) ( ) ( )exp 1 1 0b c b c p- + - - =

where b is estimated from the dispersal curve The estimates indi-cated that 10 of the population dispersed further than 240 m in 3 days This level of dispersal suggests that measures to restrict the spread of and manage potato tuber moth populations especially if insecticide resistance is present should be applied over whole grow-ing regions not within smaller farm-scale areas

101The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT

BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities In an earlier study (Zahari et al 2008) we have shown that an appropriate combination of the mean level and variability of oxygen saturation in the form of the coefficient of variation may be useful for detecting instabilities In that study all oxygen saturation measurements across different behavioural states were combined together In this study involving 17 healthy preterm babies we isolate three behavioural states (active sleep quiet sleep and others) and obtain oxygen saturation measures (mean standard deviation and coefficient of variation) for equal length segments of measurements in each state Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable) With the aid of a cluster validity index results show that the clustering based on active sleep segments perform better than the other states Furthermore clustering based on the standard deviation is superior to the mean but clustering based on the coefficient of variation is the best of the three measures

Zahari M Lee DS Russell G et al (2008 Australian Statistical Conference 2008)

102 The International Biometric Society Australasian Region Conference

Index of Presenting Authors

Arnold R 33Asghari M 26Baird D 30Balinas V 46Barnes M 27Basford KE 84Beath K 51Bland JM 49Briggs J 76Burridge C 28Burzykowski T 73Butler R (poster) 89Campain A 56Chang K 70Chang Y 85Chee C 77Clark ASS 36Clarke S 74Clifford D 72Connolly P (poster) 91Cui J 55DrsquoAntuono M 67Darnell R (1) 35Darnell R (2) 47Davy M 40Day S 43Ding P 69Dobbie M 48Dodds K 88Fewster R 37Forrester R 34Ganesalingam S 79Ganesh S 78Gatpatan JMC 48Graham M 33Graham P 65Huang E 87Hwang J 57Ihaka R 36Jones G 45Kifley A 53Kipnis V 61Koolaard J (poster) 92Kravchuk O (poster 1) 90Kravchuk O (poster 2) 92Lazaridis D 93

Le Cao K 81Littlejohn R 67Liu I 66Liu J 82Lumley T 59Marschner I 52Matthews D 58McLachlan A 44Meyer D 68Meyer R 75Mohebbi M 38Mueller S 47Muller W (poster) 94Naka M (poster) 95Neeman T 82Neuhaumluser M 86Orellana L 54Park E 64Park Z 42Pirie M 32Poppe K 71Rousta S (poster) 96Ruggiero K 71Ryan L 25Sanagou M (poster) 97Saville D 83Scott A 60Shimadzu H 62Shimadzu H (poster) 98Sibanda N 76Smerek M (poster) 99Smith AB 69Stewart M 77Stojanovski E 31Taylor J 41Thijs H 50Triggs CM 80Wallace AR (poster) 100Wang Y 29Welsh A 51Williams E 70Yee T 62Yelland L 63Yoon H 39Zahari M (poster) 101

103The International Biometric Society Australasian Region Conference

NOTES

104 The International Biometric Society Australasian Region Conference

NOTES

105The International Biometric Society Australasian Region Conference

NOTES

106 The International Biometric Society Australasian Region Conference

NOTES

107The International Biometric Society Australasian Region Conference

DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau

108 The International Biometric Society Australasian Region Conference

Name E-mail

Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde

Delegates List

109The International Biometric Society Australasian Region Conference

Delegates List

Name Email

Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz

Page 8: here - Conferences @FOS - The University of Auckland

5The International Biometric Society Australasian Region Conference

Keynote Speakers

Louise Ryan CSIRO

After 25 years as a faculty member in the Department of Biostatistics at the Harvard School of Public Health Louise Ryan returned to Australia earlier this year to join CSIRO (Commonwealth Scientific and Industrial Research Organisation) as Chief of the Division of Mathematics Informatics and Statistics (CMIS) Dr Ryan has a distinguished career in biostatistics having authored or co-authored over 200 papers in peer-reviewed journals Louise is a fellow of the American Statistical Association and the International Statistics Institute and is an elected member of the Institute of Medicine She has served in a variety of professional capacities including co-editor of Biometrics and President of the Eastern North American Region of the International Biometric Society She has served on advisory boards for several government agencies in the USA including the National Toxicology Program and the Environmental Protection Agency as well as several committees for the National Academy of Science She retains an adjunct professorship at Harvard

Chris Triggs University of Auckland

Chris Triggs is a Professor as well as being the current department head of Statistics at the University of Auckland New Zealand He has been a respected statistician for 30 years specializing in fields as diverse as experimental design and forensic science Professor Triggs has published more than 90 papers in a wide variety of statistical fields His research interests include experimental design population genetics and the application of statistical methods in many fields of science including forensic science and nutrigenomics He has lectured extensively in many of these subjects in Australasia Professor Triggs is an Associate Editor for Biometrics and is often called upon as referee for many other journals

6 The International Biometric Society Australasian Region Conference

INVITED SPEAKERSRoss Ihaka University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland He is recognized as one of the originators of the R programming language In 2008 he received the Royal Society of New Zealandrsquos Pickering Medal for his work on R

Kaye Basford University of Queensland

Kaye Basford is Head of the School of Land Crop and Food Sciences at the University of Queensland Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments in particular using a pattern analysis approach Kaye is currently IBS Vice-President in advance of her Presidential term 2010-11

Alison Smith NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit where she works on and researches methodology for plant breeding multi-environment variety trials plant quality trait experiments micro-array data and outlier detection in linear mixed models

7The International Biometric Society Australasian Region Conference

GENERAL INFORMATIONName TagsPlease wear your name badge at all times during the conference and at social events

Mobile PhonesAs a courtesy to presenters and colleagues please ensure that your mobile phone is switched off during the conference sessions

Conference CateringLunches Morning and Afternoon Teas will be served at the lsquoChill on Northcroftrsquo Restaurant (see venue floor plan on page 16)

Conference DinnerTickets are required for the Conference Dinner If you have misplaced or did not receive tickets at registration or wish to purchase additional tickets please see one of the conference organisers at the registration desk

Transport has been arranged in coaches to transfer delegates to dinner from the Suncourt Hotel amp Conference Centre leaving 6 pm with return trips at the conclusion of the event

Welcome reception (Sunday 29 November)A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre Drinks (beer house wine soft drinks and orange juice) and a selection of hot and cold hors drsquooeuvres will be on offer

8 The International Biometric Society Australasian Region Conference

VENUE INFORMATION amp MAPVenueThe Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo New Zealand It has uninterrupted views of the stunning great Lake Taupo with a backdrop of the majestic volcanoes Mt Ruapehu Mt Tongariro and Mt Ngauruhoe

Suncourt Hotel Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo The lake town centre boat harbour cafeacutes and night-life are all only a stroll away

Driving directions to Huka Prawn FarmHead west on Northcroft Street toward Titiraupenga Street (02km)Turn left at Titiraupenga Street (31m)Turn right at Lake Tce (05km)(or alternatively go up to Heuheu Street then onto Tongariro Street)Continue onto Tongariro Street (11km - go through one roundabout)Continue onto SH 1 SH5 (10km)Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (Note that Huka Falls Road becomes Karetoto Road)Take the sign-posted right just past Helistar and continue straight past Honey Hive to end of Karetoto Road

A Suncourt Hotel amp Conference Centre14 Northcroft Street Taupo(07) 378 8265wwwsuncourtconz

B Huka Prawn FarmHuka Falls RoadWairakei Park Taupo

9The International Biometric Society Australasian Region Conference

ORGANISED SOCIAL ACTIVITIES Conferences can be intense and lead to ldquobrain strainrdquo for some so relief from the scientific program is often welcome and necessary for recharging ones batteries With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member non-member or student) attending the whole week Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events

Young Statisticiansrsquo Night - Monday 30 NovThis social event is for young statisticians to get together in an informalrelaxing atmosphere so you can share your research and meet your possible future colleagues As long as you consider yourself as a ldquoyoung statisticianbiometricianrdquo you are welcome to attend this event We will meet at 6pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim then meal at the Terraces Hotel (80-100 Napier Taupo Highway Taupo Tel (07) 378-7080)

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities- Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors smell the coffee brewing as you board the Waikare II take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park The sights are amazing all year round Afternoon tea is included as part of your charter and tea or coffee are complimentary throughout the cruise There are also full bar facilities

Fishing for and hopefully eating rainbow or brown trout is included in the charter although to meet licence requirements only four clients can be nominated to actually land the catch Only 4 lines can be put out at a time on downriggers If successful any catch can be barbequed or sashimied and served and shared onboard - there

10 The International Biometric Society Australasian Region Conference

Organised Social Activities

is nothing like freshly caught trout There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this The trout could also be taken back to your accommodation where the chef could perhaps prepare them as part of one of your dinners (In New Zealand trout as a game fish cannot be sold so you wonrsquot see it on restaurant menus) Or you may wish to cook it yourself in the Suncourt Hotelrsquos barbeque area or elsewhere You may also wish to do clay bird shooting The cost is $180 per shot

Time 230 pm charter departure allowing time to walk to the Boat Harbour after lunch returning about 530 pm to berthWhere Boat harbourmarina at mouth of the Waikato River at north end of lake frontTake Swimwear including towel if you want an invigorating deep water swim off the launch Donrsquot forget to take your camera as some of the scenery can only be seen from on the waterCost $70 per person based on a three hour scenic charter including fishing with clay bird shooting extra at $180 per shotNotes For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River and the kayaking finishes with a small rapid at Reidrsquos Farm Kayak Course Then a short shuttle ride takes you to the renowned Huka Falls from where you are able to walk back up river to Spa Park

It is a gentle paddle down through a scenic waterway with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route It is a great trip for the ldquofirst timerdquo kayaker of any age or for those wanting the most peaceful scenic cruise available in Taupo Exiting the river there is time for a snack on the riverbank before you are transported to the Huka Falls where you may wish to find the track accessing their base You will have plenty of time for sightseeing and the walk back to back to Spa Park

11The International Biometric Society Australasian Region Conference

Organised Social Activities

Leaving the gushing sounds of the mesmerizing Falls you cut though a leafy regrowth to on a beautiful scenic river track The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs mid-channel islands and the Huka Falls Lodge on the far bank There are some scenic lookouts along the way where you can take in the full glory of the Majestic Waikato river As you near Spa Park the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet The track is graded as an easy walk and should take about one hour but you have longer before pick up at a pre-arranged time to return to your residence

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time Pickup from Suncourt Hotel at 130 pm return around 600 pmTake Swimwear towel outdoors shoes towel sunscreen hat and camera (waterproof case may be handy)Cost $50 per personNotes This activity requires a minimum of 4 people to proceed Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating geothermal and nature - Orakei Karako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen Tour through great pine forests and open farmlands that line the river bank along with some of New Zealandrsquos most beautiful unspoilt native bush The sleek black boat takes you through the magnificent Tutukau Gorge where canyon walls rise 50 dramatic metres above the magnificent Waikato River on the way to the hidden valley of Orakei Karako possibly the best thermal area in New Zealand

12 The International Biometric Society Australasian Region Conference

Organised Social Activities

In both cases thermal activity will be seen from the boat but option 1 ($155) allows one hourrsquos entry to the thermal wonderland for a close up look at gushing geysers hot springs boiling mud pools the awesome Aladdinrsquos cave and some of the largest silica terraces in the world

While the park visitors are on land option 2 ($140) whisks swimmers away to the Squeeze You will disembark the boat in knee deep warm water After manoeuvring your way through narrow crevasses climbing boulders and wading through waist-deep warm water you emerge in stunning native New Zealand bush Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool

Then the groups rejoin for the thrilling return trip giving a total trip time of about three hours This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience

Time Transport departs Suncourt Hotel at 130 pm returns at approximately 530 pmTake Appropriate footwear for option 1 swimwear and towel for option 2 Donrsquot forget to take your camera as some of the scenery can only be seen from the river Tie on your hats You may have time to eat your own snack during option 1Cost $155 pp for option 1 including park admission $140 pp for option 2 both options including transportNotes For this activity to proceed we require a minimum of only 4 people in total as the same boat is used

13The International Biometric Society Australasian Region Conference

Organised Social Activities

4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Tauporsquos most visited natural and other attractions which will include the following a visit to Aratiatia Rapids in time to see the release of water from the upstream dam viewing the geothermal borefield from where electricity is generated a short walk amongst the geothermal features at Craters of the Moon a viewing of the mighty Huka Falls from both sides of the river a visit to Acacia Bayrsquos LrsquoArte which has a interesting mosaic and sculpture garden with a gallery and cafe concluding with drop off at Scenic Cellars where they have an enomatic wine tasting system and you can choose from up to 32 wines a selection of both NZ and international wines This is on the lakefront edge of the CBDrsquos restaurant and nightlife area five minutersquos walk from the Suncourt Hotel

Time Tour coach with guide departs Suncourt Hotel at 130 pm promptly terminates at the nearby Scenic Cellars at approximately 530 pmTake Cafe snack is not included but all entry fees are Donrsquot forget to take your cameraCost $70 per personNotes For this activity to proceed we require a minimum of 8 people The maximum limit is 22

14 The International Biometric Society Australasian Region Conference

SPONSORSThe International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations in particular SAS as premier sponsor

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland

15The International Biometric Society Australasian Region Conference

Sponsors

AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax

16 The International Biometric Society Australasian Region Conference

VENUE FLOOR PLAN

1 Boardroom For all Board session presentations2 Swifts For keynote addresses invited speaker talks and all Swifts

sessions 3 BathroomsToilets4 lsquoChill on Northcroftrsquo Restaurant All morningafternoon teas and

lunches will be provided here5 Gullivers Computer room with two internet access desktops6 Lems Registration desk location and further desk space and

power points for wireless internet access

2

1

3

4

5

6

17The International Biometric Society Australasian Region Conference

CONFERENCE TIMETABLE

SUNDAY 29TH NOV1600 Conference Registration opens1800 Welcome Reception

Dinner (own arrangement)

MONDAY 30TH NOV850 Presidential Opening (Swifts)

Graham Hepworth University of Melbourne900 Keynote Address (Swifts)

Louise Ryan CSIRO Mathematics Informatics and StatisticsQuantifying uncertainty in risk assessmentChair Graham Hepworth

950

-1030

Session 1 Swifts Medical

Chair John Field

Session 1 Boardroom Ecological ModellingChair Teresa Neeman

950 Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approachMohamad Asghari Tarbiat Modares University

Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splinesCharis Burridge CSIRO Mathematics Informatics and Statistics

1010 Personalised medicine endovascular aneurysm repair risk assessment model using preoperative variablesMary Barnes CSIRO Mathematics Informatics and Statistics

Rank regression for analyzing environmental dataYou-Gan Wang CSIRO Mathematics Informatics and Statistics

1030 Morning Tea (30 minutes)1100

-1220

Session 2 Swifts Modelling

Chair Andrew McLachlan

Session 2 Boardroom Environmental amp Methods

Chair Zaneta Park1100 Introduction to Quantile

regressionDavid Baird VSN NZ Ltd

Capture recapture estimation using finite mixtures of arbitrary dimension Richard Arnold Victoria University

18 The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV1120 Incorporating study

characteristics in the modelling of associations across studiesElizabeth Stojanovski University of Newcastle

The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabiesRobert Forrester ANU

1140 A comparison of matrices of time series with application in dendroclimatologyMaryanne Pirie University of Auckland

Model based grouping of species across environmental gradientsRoss Darnell CSIRO Mathematics Informatics and Statistics

1200 How SAS and R integrateMichael Graham SAS Auckland

The use of the chi-square test when observations are dependentAustina Clark University of Otago

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts) Ross Ihaka University of AucklandWriting Efficient Programs in R and BeyondChair Renate Meyer

1410

-1510

Session 3 Swifts Variance

Chair Geoff Jones

Session 3 Boardroom Genetics

Chair John Koolaard

1410 Variance estimation for systematic designs in spatial surveysRachel Fewster University of Auckland

Developing modules in genepattern for gene expression analysisMarcus Davy Plant and Food Research

1430 Variance components analysis for balanced and unbalanced data in reliability of gait measurementMohammadreza Mohebbi Monash University

High dimensional QTL analysis within complex linear mixed modelsJulian Taylor CSIRO Mathematics Informatics and Statistics

1450 Modernizing AMOVA using ANOVAHwan-Jin Yoon ANU

Correlation of transcriptomic and phenotypic data in dairy cowsZaneta Park AgResearch

1510 Afternoon Tea (30 minutes)

19The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV1540

-1700

Session 4 Swifts Modelling

Chair Mario DrsquoAntuono

Session 4 Boardroom Ecology

Chair Rachel Fewster1540 Non-inferiority margins in

clinical trialsSimon Day Roche Products Ltd

Visualising model selection criteria for presence and absence data in ecology Samuel Mueller University of Sydney

1600 Data processing using Excel with RAndrew McLachlan Plant and Food Research Lincoln

Estimating weights for constructing composite environmental indicesRoss Darnell CSIRO Mathematics Informatics and Statistics

1620 Investigating covariate effects on BDD infection with longitudinal data Geoffrey Jones Massey University

A spatial design for monitoring the health of a large-scale freshwater river systemMelissa Dobbie CSIRO Mathematics Informatics and Statistics

1640 Statistical modelling of intrauterine growth for FilipinosVincente Balinas University of the Philippines Visayas

Backfitting estimation of a response surface modelJhoanne Marsh C Gatpatan University of the Philippines Visayas

1700 Poster SessionChair Melissa Dobbie

1800 Dinner (own arrangement)

20 The International Biometric Society Australasian Region Conference

Conference Timetable

TUESDAY 1ST DEC900 Keynote Address (Swifts)

Martin Bland University of YorkClustering by treatment provider in randomised trialsChair Simon Day

950

-1030

Session 1 Swifts Missing Data

Chair Vanessa Cave

Session 1 Boardroom Count Data

Chair Hwan-Jin Yoon950 The future of missing data

Herbet Thijs Hasselt University

A strategy for modelling count data which may have extra zerosAlan Welsh ANU

1010 Application of latent class with random effects models to longitudinal dataKen Beath Macquarie University

A reliable constrained method for identity link Poisson regressionIan Marschner Macquarie University

1030 Morning TeaIBS Biennial General Meeting (60 minutes)

1130

-1230

Session 2 Swifts Medical

Chair Hans Hockey

Session 2 Boardroom Modelling

Chair Olena Kravchuk1130 Multivariate response

models for global health-related quality of lifeAnnette Kifley Macquarie University

Building a more stable predictive logistic regression modelAnna Campain University of Sydney

1150 Estimation of optimal dynamic treatment regimes from longitudinal observational dataLiliana Orellana Universidad de Buenos Aires

Stepwise paring down variation for identifying influential multifactor interactionsJing-Shiang Hwang Academia Sinica

1210 Parametric conditional frailty models for recurrent cardiovascular events in the lipid studyJisheng Cui Deakin University

Empirical likelihood estimation of a diagnostic test likelihood ratioDavid Matthews University of Waterloo

1230 Lunch (1 hour)1330 Organised Social Activities

1800 Dinner (own arrangement)

21The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC900 Keynote Address (Swifts)

Thomas Lumley University of WashingtonUsing the whole cohort in analysis of subsampled data Chair Alan Welsh

950

-1030

Session 1 Swifts Clinical Trials

Chair Ian Marschner

Session 1 Boardroom Fisheries

Chair Charis Burridge950 Adjusting for nonresponse in

case-control studiesAlastair Scott University of Auckland

An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimationHideyasu Shimadzu GeoScience Australia

1010 Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associationsVictor Kipnis USA National Cancer Institute

On the 2008 World Fly Fishing ChampionshipsThomas Yee University of Auckland

1030 Morning Tea (30 minutes)

1100

-1220

Session 2 Swifts Medical Models

Chair Katrina Poppe

Session 2 Boardroom AgricultureHorticulture

Chair Emlyn Williams

1100 Relative risk estimation in randomised controlled trials a comparison of methods for independent observationsLisa Yelland University of Adelaide

Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactationRoger Littlejohn AgResearch

1120 Multiple stage procedures in covariate-adjusted response-adaptive designsEunsik Park Chonnam National University

Some statistical approaches in estimating lambing ratesMario DrsquoAntuono Dept of Agriculture WA

22 The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC1140 Potential outcomes and

propensity score methods for hospital performance comparisonsPatrick Graham University of Otago

FTIR analysis associations with induction and release of kiwifruit buds from dormancyDenny Meyer Swinburne University of Technology

1200 Local odds ratio estimation for multiple response contingency tablesIvy Liu Victoria University

Non-linear mixed-effects modelling for a soil temperature studyPauline Ding ANU

1220 Lunch (1 hour 10 minutes)1330 Invited Speaker (Swifts)

Alison Smith NSW Department of Industry and InvestmentEmbedded partially replicated designs for grain quality testingChair David Baird

1410

-1510

Session 3 Swifts Design

Chair Ross Darnell

Session 3 Boardroom Functional AnalysisChair Marcus Davy

1410 Spatial models for plant breeding trialsEmlyn Williams ANU

Can functional data analysis be used to develop a new measure of global cardiac functionKatrina Poppe University of Auckland

1430 A two-phase design for a high-throughput proteomics experimentKevin Chang University of Auckland

Variable penalty dynamic warping for aligning GC-MS dataDavid Clifford CSIRO

1450 Shrinking sea-urchins in a high CO2 world a two-phase experimental designKathy Ruggiero University of Auckland

A model for the enzymatically 18O-labeled MALDI-TOF mass spectraTomasz Burzykowski Hasslet University

1510 Afternoon Tea (30 minutes)

23The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC1540

-1700

Session 4 Swifts Methods

Chair David Clifford

Session 4 Boardroom Mixtures amp Classification

Chair Thomas Yee1540 High-dimensional multiple

hypothesis testing with dependenceSandy Clarke University of Melbourne

On estimation of nonsingular normal mixture densitiesMichael Stewart University of Sydney

1600 Metropolis-Hastings algorithms with adaptive proposalsRenate Meyer University of Auckland

Estimation of finite mixtures with nonparametric componentsChew-Seng Chee University of Auckland

1620 Bayesian inference for multinomial probabilities with non-unique cell classification and sparse dataNokuthaba Sibanda Victoria University

Classification techniques for class imbalance dataSiva Ganesh Massey University

1640 Filtering in high dimension dynamic systems using copulasJonathon Briggs University of Auckland

Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the meanSelvanayagam Ganesalingam Massey University

1800 Conference Dinner

24 The International Biometric Society Australasian Region Conference

Conference Timetable

THURSDAY 3RD DEC900 Keynote Address (Swifts)

Chris Triggs University of AucklandNutrigenomics - a source of new statistical challengesChair Ruth Butler

950

-1030

Session 1 Swifts Genetics

Chair Ken Dodds

Session 1 Boardroom Ecology

Chair Duncan Hedderley950 Combination of clinical and

genetic markers to improve cancer prognosisKim-Anh Le Cao University of Queensland

A multivariate feast among bandicoots at Heirisson ProngTeresa Neeman ANU

1010 Effective population size estimation using linkage disequilibrium and diffusion approximationJing Liu University of Auckland

Environmental impact assessments a statistical encounterDave Saville Saville Statistical Consulting Ltd

1030 Morning Tea (30 minutes)1100 Invited Speaker (Swifts)

Kaye Basford University of QueenslandOrdination of marker-trait association profiles from long-term international wheat trialsChair Lyn Hunt

1140

-1220

Session 2 Swifts Medical

Chair Ken Beath

Session 2 Boardroom Genetics

Chair Julian Taylor1140 Finding best linear

combination of markers for a medical diagnostic with restricted false positive rateYuan-chin Chang Academia Sinica

Believing in magic validation of a novel experimental breeding designEmma Huang CSIRO Mathematics Informatics and Statistics

1200 A modified combination test for the analysis of clinical trialsMarkus Neuhaumluser Rhein Ahr Campus

Phenotypes for training and validation of whole genome selection methodsKen Dodds AgResearch

1220 Closing Remarks1230 Lunch1300 Conference Concludes

25The International Biometric Society Australasian Region Conference

ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts) Louise Ryan CSIRO Mathematics Informatics and StatisticsChair Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise RyanCSIRO Mathematics Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events Such quantifications can then be used to inform policy (eg environmental standards) or to attach appropriate monetary value (eg insurance industry) However risk assessment is fraught with uncertainty due to model choice data availability or underlying science of the problem being studied This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty and illustrate them with several examples from the environmental arena

26 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

950 - 1030

MONDAY 30TH NOVSession 1 Swifts Medical Chair John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING

RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal Cancer (CRC) is one the most malignant cancer through the world and it varies because the different effect of risk factors in different part of the world Also knowing the risk factors of the cancer has clinical importance for prognosis and treatment application However evaluation of the risk factors of the cancer as a whole would not provide thorough understanding of the cancer Therefore the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis A total of 1219 patients with CRC diagnosis according to the pathology report of cancer registry of RCGLD from 1 January 2002 to 1 October 2007 were entered into the study Data were analyzed using univariate and multivariate competing risk survival analysis utilizing Stata statistical software The results confirms gender alcohol history IBD and tumor grade as specific risk factors of colon cancer and hypertension opium and personal history as specific risk factors of rectum cancer Also BMI and pathologic stage of the cancer are common risk factors of both types of cancers Based our findings CRC is not a single entity and colon and rectum cancers should be evaluated specifically to reveal the hidden associations which may not be revealed under general modeling These findings could provide more information for prognosis and treatment therapy and possible application of screening programs specifically for colon and rectum carcinomas

27The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

PERSONALISED MEDICINE ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING

PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia Mathematics Informatics and Statistics Glen Osmond

South Australia2Department of Surgery University of Adelaide the Queen Elizabeth Hospital

Adelaide South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patientrsquos pre-operative variables The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001 and whose outcomes were followed for more than five years

The ERA Model is available at the following website (wwwhealthadelaideeduausurgeryevar) The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death aneurysm-related death survival type I endoleaks and mid-term re-interventions Secondary endpoints predicted include technical and clinical success type II endoleaks graft complications migration rupture and conversion to open repair The eight pre-operative variables are age at operation American Society of Anaesthesiologists rating gender aneurysm diameter creatinine aortic neck angle infrarenal neck length and infrarenal neck diameter Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model

The ERA Model was internally validated on Australian data using bootstrapping [1] Recently it has been externally validated using specialist UK vascular institute data Despite UK patients being sicker (plt0001) having larger aneurysms (plt0001) and being more likely to die (plt005) than the Australian patients the ERA model fitted UK data better for the risk factors- early death aneurysm-related death three-year survival and type I endoleaks as evidenced by higher area under ROC curves andor higher R2

The ERA Model appears to be robust Further external validation and improvements to the model will occur within a recently approved NHMRC grant

1 Barnes (2008 Eur J Vasc Endovasc Surg 35571-579)

28 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

950 - 1030

MONDAY 30TH NOV Session 1 Boardroom Ecological ModellingChair Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED

REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics Informatics and Statistics2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF) fitting population dynamics models to commercial catch data and conducting prawn ecology research Between 1975 and 1992 there were three survey series each covering a limited region However there was no ongoing monitoring of multiple regions due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks In 2001 an international review of the management process for this fishery recommended that annual multi-species fishery-independent survey be introduced Bi-annual surveys of prawn distribution and abundance started in August 2002 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter Locations were randomly selected from areas of long-standing fishing effort We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (httpwwwstatuni-muenchende~bayesx) Some adaptation was needed in order to allocate knots to the subset of the 300000 sqkm Gulf of Carpentaria represented by the survey The Bayesian approach leads straightforwardly to mean density estimates with credible interval for each region as well as the entire survey area We compare this approach with more routine design-based estimates and bootstrapped confidence intervals

29The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics Informatics and Statistics Australia2School of Mathematics and Statistics Northeast Normal University China

E-mail you-ganwangcsiroau

We investigate the rank regression for environmental data analysis Rank regression is robust and has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions To alleviate computational burden we apply the induced smoothing method which provides both regression parameter estimates and their covariance matrices after a few iterations Datasets from a few environmental studies will be analyzed for illustration

30 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

1100 -1220

MONDAY 30TH NOVSession 2 Swifts ModellingChair Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird11VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data for example the median rather than to the mean Rather than minimizing the usual sum of squares for the median we minimize the sum of absolute deviances In general for quantile Q we minimize Se(Q ndash I (e lt 0)) where I is a 0 1 indicator function and e is the model residual Quantile regression does not make any assumptions about the distribution of the data Rather than just a single curve being fitted to the data a set of curves representing different quantiles can be fitted Fitting the model is done using linear programming Inference is made using approximate methods or bootstrapping Quantile regression can be extended from linear models to splines loess and non-linear models Quantile regression is available in GenStat R and SAS This talk will give a simple ground up explanation of the methods and some applications of quantile regression

Roger Koenker 2005 Quantile Regression Cambridge University Press

31The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS University of Newcastle2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewingrsquos sarcomas using 2-year survival as the outcome will be investigated Information regarding these associations is however limited and often conflicting This may be partly attributed to differences between studies which can be considered sources of statistical heterogeneity

The purpose of a recent meta-analysis conducted by Honoki et al [2007] was to identify studies that examined the association between p16INK4a status and two-year survival This study was based on six studies representing 188 patients which met the inclusion criteria and were pooled in the meta-analysis The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewingrsquos sarcoma prognosis (estimated pooled risk ratio 217 95 confidence interval 155-303)

In the present study a random-effects Bayesian meta-analysis model is conducted to combine the reported estimates of the selected studies by allowing major sources of variation to be taken into account study level characteristics between and within study variance Initially the observed risk ratios are assumed random samples from study-specific true ratios which are themselves assumed distributed around an overall ratio In the second model there are hierarchical levels between the study-specific parameters and the overall distribution The latter model can thus accommodate partial exchangeability between studies acknowledging that some studies are more similar due to common designs locations and so on Analysis was undertaken in WinBUGS

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, the analysis strengthens the findings of the former study.


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie1

1Department of Statistics and School of Geography, Geology and Environment,

University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has a strong potential as a source for inferring past climates.

Kauri tree-ring widths have been used to reconstruct the activity of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are a result of a possible failure of the uniformitarianism principle. This is because the response of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring-width indices for each core was divided into two subsets:

1. The portion of the series produced when the trees were small, and

2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham1

1Analytics - SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers a way to experiment with new cutting-edge methods; others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together and plans for future integration

1100 - 1220

MONDAY 30TH NOV, Session 2 (Boardroom): Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ
2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit, ANU
2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals but has not yet been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaConTM (Vac1), or a single vaccination of GonaConTM followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions at irregular intervals over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or boosted vaccination.

The data are analysed using repeated measures methods to assess the long-term effects of the vaccination. Since the data are unequally spaced in time, the number of possible options available is restricted. Some approaches are explored and the differences between the results examined.


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Wealth from Ocean Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss; we term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area; maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.
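
A finite mixture of binomial GLMs in this general spirit can be sketched with R's flexmix package; this is not the authors' implementation, and the simulated two-archetype example below is purely illustrative:

  # Hedged sketch: mixture of presence/absence GLMs over an environmental gradient.
  library(flexmix)
  set.seed(1)
  n    <- 500
  temp <- runif(n, 20, 30)
  grp  <- sample(1:2, n, replace = TRUE)                  # latent "archetype"
  p    <- plogis(ifelse(grp == 1, -8 + 0.35 * temp, 6 - 0.25 * temp))
  dat  <- data.frame(present = rbinom(n, 1, p), temp)

  fit <- flexmix(cbind(present, 1 - present) ~ temp, data = dat, k = 2,
                 model = FLXMRglm(family = "binomial"))
  parameters(fit)   # component-specific GLM coefficients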


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark1

1University of Otago

E-mail aclarkmathsotagoacnz

When the Chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent we would like to investigate (1) how to implement the Chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S. W. (1958, JEBS, 69-82) and Huynh, H. & Feldt, L. S. (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.

1330 MONDAY 30TH NOV, Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ
2University of California Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created; some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R, and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R; we'll try to show the nature of these differences.


1410 - 1510

MONDAY 30TH NOV, Session 3 (Swifts): Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1

1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modelling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle; we present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods is illustrated with examples from a gait measurement study.

Conclusion: Variance component methods are useful tools for analysing reliability data, but care should be taken in the design and analysis of these studies.


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages such as Arlequin and GenAlex are required.

Using fungus microsatellite data we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages in which ANOVA and REML are standard methods may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
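
As a hedged illustration of the ANOVA/REML route described here (the data frame and column names are invented, not taken from the talk), nested variance components can be estimated in R with lme4:

  # Minimal sketch: hierarchical variance components via REML, assuming a
  # data frame 'gen' with a numeric marker-based score 'y' and factors
  # 'region' and 'population' (population nested within region).
  library(lme4)

  fit <- lmer(y ~ 1 + (1 | region/population), data = gen, REML = TRUE)

  # Variance partitioned by tier: among regions, among populations within
  # regions, and the residual (individuals within populations)
  print(VarCorr(fit), comp = "Variance")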


1410 - 1510

MONDAY 30TH NOV, Session 3 (Boardroom): Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules, using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ-funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla1,2

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation, such as spatial trends and extraneous environmental variation, need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high-dimensional genetic component becomes problematic. This talk discusses the incorporation of high-dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way, even when the number of genetic variables exceeds the number of observations. The method is then applied to wheat quality traits and a well-established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes, for both liver and fat samples, in >250 dairy cows, together with associated phenotypic data (milk yield, protein, casein and total solids percentage and yield, and growth hormone, IGF and insulin levels). This data is highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data were analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data were used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.
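
A minimal sketch of the kind of per-gene screen described above; the object names, column names and the candidate gene are hypothetical, not taken from the study:

  # Hedged sketch: correlation screen, then a mixed model with sire as a
  # random effect, assuming 'expr' is a genes x cows matrix and 'pheno' a
  # data frame with columns 'milk_yield' and 'sire' for the same cows.
  library(lme4)

  # Simple correlation screen across all gene-phenotype combinations
  cor_p <- apply(expr, 1, function(g) cor.test(g, pheno$milk_yield)$p.value)

  # Follow-up mixed effects regression for one candidate gene
  dat1 <- data.frame(pheno, expr1 = expr["candidate_gene", ])
  fit  <- lmer(milk_yield ~ expr1 + (1 | sire), data = dat1)
  summary(fit)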


1540 - 1700

MONDAY 30TH NOV, Session 4 (Swifts): Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined: some are highly statistical, some are based much more on clinical judgement, some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum, we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment, nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication, and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1

1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample these texture analysis and rheological methods generated many data points, which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to identify points of interest more objectively and to process large numbers of sample results quickly.
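
A minimal sketch, in plain R, of the kind of curve summary described; the curve below is simulated, and none of the names come from the talk:

  # Hedged sketch: locate a peak and the point of maximum slope on a sampled curve.
  x <- seq(0, 10, by = 0.01)
  y <- dnorm(x, mean = 4, sd = 1) + rnorm(length(x), sd = 0.002)

  # Smooth lightly before differentiating, then take numerical derivatives
  ys    <- lowess(x, y, f = 0.1)$y
  slope <- diff(ys) / diff(x)

  peak_x      <- x[which.max(ys)]       # location of the highest point
  max_slope_x <- x[which.max(slope)]    # location of the steepest rise
  c(peak = peak_x, steepest = max_slope_x)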


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful, hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurement of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the Chitty data and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth differs between populations. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length


1540 - 1700

MONDAY 30TH NOV, Session 4 (Boardroom): Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled as a function of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood-based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.
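
A hedged sketch of comparing information criteria across candidate presence/absence models in R; the data frame 'plots' and the predictor names are invented for illustration and do not come from the study:

  # Minimal sketch: AIC and BIC for a small set of candidate logistic regressions,
  # assuming 'plots' has a 0/1 column 'presence' and predictors 'food',
  # 'cover' and 'dist_refuge'.
  candidates <- list(
    m1 = presence ~ food,
    m2 = presence ~ food + cover,
    m3 = presence ~ food + cover + dist_refuge
  )
  fits <- lapply(candidates, glm, family = binomial, data = plots)
  data.frame(AIC = sapply(fits, AIC), BIC = sapply(fits, BIC))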

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains, such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.
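
A hedged sketch of the Bradley-Terry idea, fitted directly as a logistic regression in R; the indicator names and judgement counts are invented, and this is not the authors' implementation:

  # Hedged sketch: Bradley-Terry weights from pairwise importance judgements.
  indicators <- c("water", "soil", "biota")
  prefs <- data.frame(
    ind1 = c("water", "water", "soil"),
    ind2 = c("soil",  "biota", "biota"),
    win1 = c(5, 7, 4),   # expert judgements favouring the first indicator
    win2 = c(3, 1, 4)    # judgements favouring the second indicator
  )

  # +1/-1 design matrix: logit P(ind1 preferred) = lambda[ind1] - lambda[ind2]
  X <- t(apply(prefs, 1, function(r)
    (indicators == r["ind1"]) - (indicators == r["ind2"])))
  colnames(X) <- indicators

  fit <- glm(cbind(win1, win2) ~ X[, -1] - 1, family = binomial, data = prefs)

  lambda  <- c(0, coef(fit))               # first indicator as reference
  weights <- exp(lambda) / sum(exp(lambda))  # normalised relative weights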


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Australia, Mathematics, Informatics and Statistics
2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, handling the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second order model, central composite design
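
For readers unfamiliar with the algorithm, here is a minimal generic backfitting loop in R (a two-term additive fit on simulated data; nothing in it is taken from the paper):

  # Hedged sketch of generic backfitting: cycle through the additive terms,
  # refitting each smoother to the partial residuals until convergence.
  set.seed(1)
  n  <- 200
  x1 <- runif(n); x2 <- runif(n)
  y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)

  f1 <- rep(0, n); f2 <- rep(0, n); alpha <- mean(y)
  for (iter in 1:20) {
    r1 <- y - alpha - f2
    f1 <- predict(loess(r1 ~ x1, span = 0.5)); f1 <- f1 - mean(f1)
    r2 <- y - alpha - f1
    f2 <- predict(loess(r2 ~ x2, span = 0.5)); f2 <- f2 - mean(f2)
  }
  fitted_vals <- alpha + f1 + f2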


TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland1

1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.
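
One common way to allow for clustering by treatment provider (not necessarily the method suggested in the talk) is a mixed model with a random effect for the operator; a minimal sketch with invented variable names:

  # Hedged sketch: random intercept for the therapist/surgeon in an
  # individually randomised trial, assuming a data frame 'trial' with
  # columns outcome, treatment and provider.
  library(lme4)
  fit <- lmer(outcome ~ treatment + (1 | provider), data = trial)
  summary(fit)   # treatment effect with provider-level clustering absorbed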


950 - 1030

TUESDAY 1ST DEC, Session 1 (Swifts): Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1

1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were developed more recently in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above-mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision, it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as the other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.


APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1

1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in the identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect; this may be improved by assuming that each class is a mixture.

950 - 1030

TUESDAY 1ST DEC, Session 1 (Boardroom): Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1

1The Australian National University

E-mail AlanWelshanueduau

I will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
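
A hedged sketch of the kind of model sequence described, using standard R tools on simulated counts; the talk's own example and diagnostics are not reproduced here:

  # Minimal sketch: Poisson, then negative binomial (overdispersion), then a
  # zero-inflated model (extra zeros).
  library(MASS)   # glm.nb
  library(pscl)   # zeroinfl

  set.seed(1)
  x  <- runif(300)
  mu <- exp(0.5 + 1.5 * x)
  y  <- rnbinom(300, mu = mu, size = 1)          # overdispersed counts
  y[rbinom(300, 1, 0.2) == 1] <- 0               # add structural zeros
  dat <- data.frame(y, x)

  f_pois <- glm(y ~ x, family = poisson, data = dat)          # starting point
  f_nb   <- glm.nb(y ~ x, data = dat)                         # overdispersion
  f_zinb <- zeroinfl(y ~ x | 1, dist = "negbin", data = dat)  # extra zeros
  AIC(f_pois, f_nb, f_zinb)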


A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1

1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well-known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable, due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients, rather than the fitted means, to be non-negative. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.
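
For context only (this is not the author's EM method, which the abstract does not spell out), the conventional identity-link fit in R, whose potential instability motivates the talk, looks like this on simulated data:

  # Hedged sketch: the standard identity-link Poisson fit, which needs valid
  # (positive-mean) starting values and can still fail when fitted means
  # approach zero.
  set.seed(1)
  x <- runif(500)
  y <- rpois(500, lambda = 0.2 + 2 * x)     # additive rate model
  dat <- data.frame(y, x)

  fit <- glm(y ~ x, family = poisson(link = "identity"),
             start = c(0.5, 0.5), data = dat)
  coef(fit)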


1130 - 1230

TUESDAY 1ST DEC, Session 2 (Swifts): Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.


ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1 Andrea Rotnitzky23 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV-infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on the efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV positive patients to illustrate estimation of the optimal CD4 count level to start HAART


PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction (MI) events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models, the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk for those who did not have an MI event; the corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.
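
As a rough illustration of conditional (gap-time) recurrent-event modelling with a frailty term, here is a generic semiparametric sketch with R's survival package; it is a stand-in for, not a reproduction of, the parametric Weibull-gamma models discussed, and every name below is invented:

  # Hedged sketch: PWP-style gap-time model with a gamma frailty per subject,
  # assuming a long-format data frame 'mi' with one row per at-risk interval:
  # id, gaptime, event (0/1), event_no (1st, 2nd, ... recurrence) and treat.
  library(survival)

  fit <- coxph(Surv(gaptime, event) ~ treat + strata(event_no) +
                 frailty(id, distribution = "gamma"),
               data = mi)
  summary(fit)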


1130 - 1230

TUESDAY 1ST DEC, Session 2 (Boardroom): Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities, and is in that sense superior to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing similar parameter estimates after imputation to those found when the data were fully observed. It will be shown that the amount of missingness present in the data set and the nature of the variable in question affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.


STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists use techniques that produce high-dimensional data to unveil hidden information. Although several model-based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea for the identification of influential sets of variables. The method is simple, as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.


EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1

1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e., the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1 / p2 and r- = (1 - p1) / (1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio r(x) = f1(x) / f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating r(x) and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.


WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1

1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample, for a "validation study", or a sample stratified on a health outcome, for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.


950 - 1030

WEDNESDAY 2ND DEC, Session 1 (Swifts): Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al. (2002, Biometrical J. 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al. 1997, Ann. Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al. for their simulations.
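
A hedged sketch of the basic survey-weighted (inverse selection probability) analysis that forms the baseline of such comparisons, using R's survey package; the data frame and variable names are invented:

  # Minimal sketch: weighted logistic regression for a case-control sample
  # with nonresponse, assuming 'cc' contains case status, covariates and
  # 'pr_sel', each respondent's known probability of selection and response.
  library(survey)

  cc$wt <- 1 / cc$pr_sel
  des <- svydesign(ids = ~1, weights = ~wt, data = cc)
  fit <- svyglm(case ~ exposure + age, design = des, family = quasibinomial())
  summary(fit)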


CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute
2Texas A&M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice for assessing usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error in FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present a recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.


950 - 1030

WEDNESDAY 2ND DEC, Session 1 (Boardroom): Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2

1Geoscience Australia
2CSIRO Mathematics, Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model reflecting the commonly used sampling process in marine surveys is proposed and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling is quite influential on presence/absence measures of species, which is no longer ignorable.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1

1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua region, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and tests of whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.


1100 - 1220

WEDNESDAY 2ND DEC, Session 2 (Swifts): Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
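
Two of the approaches under comparison (log binomial regression, and log Poisson regression with robust variance in the spirit of Zou 2004) can be sketched in R as follows; the data frame 'rct' and its variables are hypothetical, with treatment assumed coded 0/1:

  # Hedged sketch: adjusted relative risk via log binomial regression and via
  # log Poisson regression with robust (sandwich) standard errors.
  library(sandwich)
  library(lmtest)

  f_logbin <- glm(outcome ~ treatment + age, family = binomial(link = "log"),
                  data = rct)                        # may fail to converge

  f_poisson <- glm(outcome ~ treatment + age, family = poisson(link = "log"),
                   data = rct)
  coeftest(f_poisson, vcov = vcovHC(f_poisson, type = "HC0"))
  exp(coef(f_poisson)["treatment"])                  # adjusted relative risk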


MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail espark02gmailcom

The idea of a response-adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial without diminishing its statistical significance and efficiency too much. In addition, innovation in genomics-related biomedical research is making personalized medicine possible, which also makes adjustment for covariates of the subjects who join the trial an important issue in a clinical trial.

Adaptive design is a longstanding statistical method for situations where the design for a statistical model involves unknown parameters that should be estimated during the course of an experiment; thus the concept of sequential analysis is naturally involved. The large sample properties of estimation for such problems have been studied and can be found in the literature, for example Zhang et al. (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires both the estimation and design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure -- the multiple-stage method, which requires the estimation and design to be updated at each stage -- and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice so it maintains the advantage of fully sequential method to some level and is more convenient in practical operation Here we study the three-stage procedure based on a logistic regression model which is very popular in evaluating treatment effects when binary responses are observed The numerical study of synthesized data is also presented

Traditionally we use a response-adaptive (RA) design by assuming there is no treatment-covariate interaction effect where the slopes of two treatments are equal However if there is an interaction between treatment and covariate that is another reason to consider covariate-adjusted response-adaptive (CARA) design besides the ethical reason In this case the RA design cannot detect the differences of treatment effects since it is used under the assumption of no interaction between covariates and treatments Furthermore the method with RA design will make incorrect treatment allocation That is it can be correct only in one side of population but completely wrong in the other side Thus in this case the CARA design should perform better than the RA design


In this work we also compare sequential analysis in response-adaptive designs with and without covariate adjustment, and a numerical study of synthesised data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1

1University of Otago, Christchurch

E-mail: patrickgraham@otago.ac.nz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes and, secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30 day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.


LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail: iliu@msor.vuw.ac.nz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control for other factors that might influence these relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary table has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, in many surveys respondents may select more than one outcome category, so the observations can fall in more than one cell of the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods for estimating the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance in a simulation study.
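As a small illustration (not the authors' estimators), the local odds ratios of an I x J table are simply the odds ratios of the adjacent 2 x 2 subtables; in R:

local_odds_ratios <- function(tab) {
  I <- nrow(tab); J <- ncol(tab)
  out <- matrix(NA, I - 1, J - 1)
  for (i in 1:(I - 1)) {
    for (j in 1:(J - 1)) {
      out[i, j] <- (tab[i, j] * tab[i + 1, j + 1]) / (tab[i, j + 1] * tab[i + 1, j])
    }
  }
  out
}

# Hypothetical 3 x 3 table of counts
tab <- matrix(c(20, 10, 5,
                12, 15, 9,
                 4, 11, 18), nrow = 3, byrow = TRUE)
local_odds_ratios(tab)

With multiple responses the cell counts are no longer ordinary multinomial counts, which is what motivates the specialised maximum likelihood and Mantel-Haenszel estimators and the new (co)variance estimators described in the talk.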


11:00 - 12:20
WEDNESDAY 2ND DEC, Session 2, Boardroom: Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail: rogerlittlejohn@agresearch.co.nz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1

1Dept of Agriculture and Food, Western Australia

E-mail: mdantuono@agric.wa.gov.au

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the 'seeming' lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS: ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1, Murray Judd2, John Meekings3, Annette Richardson3 and Eric Walton4

1Swinburne University of Technology
2Seeka Kiwifruit Industries
3The New Zealand Institute for Plant and Food Research Ltd
4University of Otago

E-mail: dmeyer@swin.edu.au

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1

1Australian National University

E-mail: Paulineding@anu.edu.au

There is growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were Ground Cover Type (covered, uncovered), Distance from the log (0 cm, 10 cm, 20 cm, 40 cm, 80 cm) and Depth (1 cm, 5 cm). Two non-linear mixed models were used to study the different treatment effects.

13:30 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1

1Wagga Wagga Agricultural Institute, Australia
2Rothamsted Research, Harpenden, UK

E-mail: alisonsmith@industry.nsw.gov.au

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield Such trials are also used to obtain information on grain quality traits but these are rarely subjected to the same level of statistical rigour The data are often obtained using composite rather than individual replicate samples This precludes the use of an efficient statistical analysis In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield) This allows application of efficient mixed model analyses for both grain yield and grain quality traits


14:10 - 15:10
WEDNESDAY 2ND DEC, Session 3, Swifts: Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1

1Statistical Consulting Unit, ANU

E-mail: emlynwilliams@anu.edu.au

Most plant breeding trials involve a layout of plots in rows and columns Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects Further improvement may be possible by postblocking or by inclusion of spatial model components Options are investigated for augmenting a baseline row-column model by the addition of spatial components The models considered include different variants of the linear variance model in one and two dimensions Usefulness of these options is assessed by presenting results from a number of field variety trials

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail: kcha193@aucklanduni.ac.nz

A leading complication in cardiac myopathy (weakening of the heart muscle) is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second phase laboratory-based experiment The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail: kruggiero@auckland.ac.nz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment While alternative designs are possible at Phase 2 we will show why one design is preferred over another We will also consider how the data from this design can be analysed using the LIMMA package in R
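A minimal sketch (an assumed two-colour workflow with hypothetical file names, not the authors' code) of such a LIMMA analysis in R:

library(limma)

targets <- readTargets("targets.txt")               # which samples are on each array
RG <- read.maimages(targets, source = "genepix")    # assumed scanner output format
MA <- normalizeWithinArrays(RG)                     # within-array loess normalisation

design <- modelMatrix(targets, ref = "control")     # encode the sample pairings
fit <- lmFit(MA, design)                            # gene-wise linear models
fit <- eBayes(fit)                                  # moderated t-statistics
topTable(fit, coef = 1)                             # ranked differential expression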

14:10 - 15:10
WEDNESDAY 2ND DEC, Session 3, Boardroom: Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1, Gillian Whalley1, Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail: kpoppe@auckland.ac.nz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat) However measurements that assess cardiac function are traditionally taken at only brief moments during that process and assess contraction separately to relaxation

Three-dimensional ultrasound images of the heart allow volume in the left ventricle (LV) to be measured from each frame of an imaging sequence Using functional data analysis the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken Plotting volume against the first and second derivatives evolves a closed loop in three dimensions After finding the projection that maximises the area within the loop we can compare the areas during contraction and relaxation and so develop a new measure of global cardiac function
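A minimal sketch (toy data only, not the authors' analysis) of the idea in R, smoothing the volume measurements and taking derivatives:

set.seed(1)
time   <- seq(0, 1, length.out = 40)                        # one cardiac cycle, rescaled
volume <- 120 - 50 * sin(pi * time)^2 + rnorm(40, sd = 1)   # toy LV volume (ml)

sm  <- smooth.spline(time, volume)            # convert measurements to a smooth function
t0  <- seq(0, 1, length.out = 200)
V   <- predict(sm, t0)$y                      # fitted volume
dV  <- predict(sm, t0, deriv = 1)$y           # first derivative
d2V <- predict(sm, t0, deriv = 2)$y           # second derivative

# (V, dV, d2V) traces a closed loop in three dimensions; projecting it and
# comparing the areas during contraction and relaxation gives the new measure.
pairs(cbind(V, dV, d2V))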


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail: davidclifford@csiro.au

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology separates the test substance into its constituent compounds and quantifies the amount of each. Typically the first step in an analysis of data like this is alignment, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances - e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al. (2009, Anal Chem 81(3), 1000-1007)
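A minimal sketch (illustration only; a fixed additive penalty is used here rather than the variable penalty of the talk) of penalised dynamic time warping in R:

dtw_penalised <- function(x, y, penalty = 0) {
  n <- length(x); m <- length(y)
  D <- matrix(Inf, n + 1, m + 1)       # cumulative cost matrix
  D[1, 1] <- 0
  for (i in 1:n) {
    for (j in 1:m) {
      cost <- abs(x[i] - y[j])
      D[i + 1, j + 1] <- min(D[i, j] + cost,                  # diagonal step
                             D[i, j + 1] + cost + penalty,    # non-diagonal steps
                             D[i + 1, j] + cost + penalty)    #   are penalised
    }
  }
  D[n + 1, m + 1]                      # total alignment cost
}

# Toy chromatogram-like signals: y is x shifted slightly in time
t <- seq(0, 10, by = 0.05)
x <- dnorm(t, 4, 0.3) + dnorm(t, 7, 0.2)
y <- dnorm(t, 4.3, 0.3) + dnorm(t, 7.3, 0.2)

dtw_penalised(x, y, penalty = 0)      # regular DTW
dtw_penalised(x, y, penalty = 0.1)    # penalty discourages over-warping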


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1, Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail: tomaszburzykowski@uhasselt.be

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, e.g., two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, allows the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may incorporate various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al. (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15:40 - 17:00
WEDNESDAY 2ND DEC, Session 4, Swifts: Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1

1University of Melbourne

E-mail: sjclarke@unimelb.edu.au

Multiple hypothesis testing is a research area that has grown considerably in recent years as the amount of data available to statisticians grows from a variety of applications High-dimensional contexts have their own challenges in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate or FDR) low

In these contexts the assumption of independence between test statistics is commonly made, although this is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances where this is not the case, and these will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates which are computationally difficult (and even inaccurate) in the high-dimensional context Others provide overly conservative adjustments and consequently result in a loss of statistical power Ideally understanding the effects of dependence on quantities like FWER or FDR should enable us to improve the power of our procedures to control these quantities

As well as summarising some of the existing results in this area this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored with the aim of developing methods to adjust for it
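A minimal sketch (illustration only) of the kind of effect discussed above: under strong positive dependence the Benjamini-Hochberg procedure still controls the FDR on average, but the realised false discovery proportion becomes much more variable. In R:

set.seed(1)
m  <- 1000     # number of hypotheses
m1 <- 50       # number of true effects

sim_fdp <- function(rho) {
  z <- sqrt(rho) * rnorm(1) + sqrt(1 - rho) * rnorm(m)   # equicorrelated statistics
  z[1:m1] <- z[1:m1] + 3                                 # add the true effects
  p <- 2 * pnorm(-abs(z))
  rejected <- which(p.adjust(p, method = "BH") <= 0.05)
  if (length(rejected) == 0) 0 else mean(rejected > m1)  # false discovery proportion
}

summary(replicate(2000, sim_fdp(0)))      # independence
summary(replicate(2000, sim_fdp(0.8)))    # strong dependence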


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1, Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand
2University of South Carolina, USA
3University of Montreal, Canada

E-mail: meyer@stat.auckland.ac.nz

Different strategies have been proposed to improve mixing and convergence properties of Markov Chain Monte Carlo algorithms These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density and require a detailed exploratory analysis of the stationary distribution andor some preliminary experiments to determine an efficient proposal Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling For the case of a one-dimensional target distribution we present two novel algorithms using mixtures of triangular and trapezoidal densities These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm to sample from non-logconcave univariate densities Using various different examples we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler
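A minimal sketch (illustration only; this is a simple scale-adapting random-walk sampler, not the triangular/trapezoidal mixture proposals of the talk) of a Metropolis-Hastings algorithm whose proposal is tuned during burn-in, in R:

log_target <- function(x) dgamma(x, shape = 3, rate = 1, log = TRUE)   # example target

adaptive_mh <- function(n_iter = 10000, burn_in = 2000) {
  x <- numeric(n_iter); x[1] <- 1
  s <- 1                                         # proposal standard deviation
  for (i in 2:n_iter) {
    prop <- rnorm(1, x[i - 1], s)
    log_alpha <- log_target(prop) - log_target(x[i - 1])
    accept <- is.finite(log_alpha) && log(runif(1)) < log_alpha
    x[i] <- if (accept) prop else x[i - 1]
    if (i <= burn_in) s <- s * ifelse(accept, 1.01, 0.99)   # adapt only during burn-in
  }
  x[-(1:burn_in)]
}

draws <- adaptive_mh()
c(mean = mean(draws), sd = sd(draws))   # compare with Gamma(3, 1): mean 3, sd sqrt(3)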


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1

1Victoria University of Wellington

E-mail: nsibanda@msor.vuw.ac.nz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified This problem is further compounded when data for some of the categories is sparse Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed In one approach an exact likelihood is used The second approach uses an augmented data likelihood The performance of the two methods is assessed using a number of prior distributions The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21 Data from one Single Nucleotide Polymorphism (SNP) is used first and improvements in performance are then assessed for additional SNPs As a validation check the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1

1University of Auckland

E-mail: jbri002@stat.auckland.ac.nz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with general distribution In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation The methodology is simple to implement and provides good results in both a simulation study and a real example


15:40 - 17:00
WEDNESDAY 2ND DEC, Session 4, Boardroom: Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1

1University of Sydney

E-mail: mstewart@usyd.edu.au

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case If the mixing distribution has controlled tails then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension We discuss some potential applications to the study of age distribution in fish populations

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail: chee@stat.auckland.ac.nz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same yet unknown family While a parametric assumption can be used for this family one can also estimate it nonparametrically to avoid distributional misspecification Instead of using the standard kernel-based estimation as suggested in some recent research in the literature in this talk we describe a new approach that uses nonparametric mixtures for solving the problem We show that the new approach performs better through simulation studies and some real-world biological data sets


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1, Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail: sganesh@massey.ac.nz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally sized, or balanced, and the classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class or group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). Traditional classification techniques perform badly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being on building models that correctly classify the minority class.

In this presentation a brief overview of the approaches found in the literature is given, followed by details of some proposed alternatives. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples and the findings are discussed.
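A minimal sketch (illustration only, simulated data) of the first approach, rebalancing via under-sampling of the majority class before fitting a classifier, in R:

set.seed(1)
n  <- 2000
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-4 + 1.5 * x1 + x2))     # rare minority class (a few percent)
dat <- data.frame(y = factor(y), x1, x2)

minority <- dat[dat$y == 1, ]
majority <- dat[dat$y == 0, ]
majority_down <- majority[sample(nrow(majority), nrow(minority)), ]  # under-sample
balanced <- rbind(minority, majority_down)

fit_raw      <- glm(y ~ x1 + x2, family = binomial, data = dat)       # imbalanced fit
fit_balanced <- glm(y ~ x1 + x2, family = binomial, data = balanced)  # balanced fit

# The balanced fit detects far more minority cases, at the cost of more
# false positives among the majority class.
table(dat$y, predict(fit_raw, dat, type = "response") > 0.5)
table(dat$y, predict(fit_balanced, dat, type = "response") > 0.5)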


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1, Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail: sganesh@massey.ac.nz

The estimation of error rates is of vital importance in classification problems, as it is used as a basis to choose the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and QDF are derived and computed for various covariance structures in a simulation exercise, and these serve as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. It also provides a closed form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data


THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS – A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1

1The University of Auckland and Nutrigenomics New Zealand

E-mail: cmtriggs@auckland.ac.nz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory Bowel Diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathological changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9:50 - 10:30
THURSDAY 3RD DEC, Session 1, Swifts: Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao1,2, Emmanuelle Meugnier3 and Geoffrey McLachlan4

1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail: klecao@uq.edu.au

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the other hand, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and nonlinear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1

1Department of Statistics, University of Auckland

E-mail: jliu070@aucklanduni.ac.nz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift is used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation, and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.

9:50 - 10:30
THURSDAY 3RD DEC, Session 1, Boardroom: Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail: teresaneeman@anu.edu.au

Invertebrates plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville) whose diet was assessed following its reintroduction to Western Australiarsquos Heirisson Prong Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets beetles pillbugs ants and other invertebrates The evident seasonal variations could partially be attributed to prey availability Multivariate analyses were used to elucidate diet patterns across seasons


ENVIRONMENTAL IMPACT ASSESSMENTS: A STATISTICAL ENCOUNTER

Dave Saville1

1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail: savillestat@gmail.com

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities Statistical advice was sought as to how best to summarise and interpret such data and comment was sought on previous attempts at analysis The resulting work raised some interesting issues that I plan to discuss in this talk Since such data are the subject of hearings I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting


11:00 THURSDAY 3RD DEC, Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1, IH Delacy1,2, J Crossa3, PM Kroonenberg4, MJ Dieters1 and KE Basford1,2

1The University of Queensland, Australia
2Australian Centre for Plant Functional Genomics, Australia
3CIMMYT, Mexico
4Leiden University, The Netherlands

E-mail: kebasford@uq.edu.au

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address the non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM block and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


11:40 - 12:20
THURSDAY 3RD DEC, Session 2, Swifts: Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1

1Institute of Statistical Science, Academia Sinica

E-mail: ycchang@sinica.edu.tw

We study linear combinations of markers, which usually improve the diagnostic power of the individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC, respectively). In some medical diagnostics the false positive rate must be confined within a specific range, which makes the pAUC a reasonable choice in such circumstances. We therefore emphasise the pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

l_p = (w_D S_D + w_{\bar{D}} S_{\bar{D}})^{-1} (m_D - m_{\bar{D}}),

where m_D, S_D and m_{\bar{D}}, S_{\bar{D}} are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients w_D, w_{\bar{D}} ∈ R depend on the given specificity and are also functions of l_p. Thus the solution for l_p requires an iterative procedure. We apply it to the data set of Liu et al (2005, Stat in Med), and the numerical results show that our method outperforms that of Liu et al (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes the pAUC. This method can be applied to problems in which the markers outnumber the subjects. Some large sample properties of this method are derived. We then apply it to some real data sets and the results are very promising, locating markers that are never found via AUC-based methods.


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1

1Rhein Ahr Campus

E-mail: neuhaeuser@rheinahrcampus.de

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed performing separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and combining the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore an asymmetric decision rule, as proposed by Bauer & Köhne (1994) for adaptive designs, is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1·p2 ≤ cα. Of course, different values for α0 and α1 result than in Bauer & Köhne (1994). For example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
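A minimal sketch (illustration only) of the decision rule described above; α0 and α1 are taken from the example in the abstract, and c_alpha is set to the ordinary Fisher-combination critical value purely for illustration (its exact value in the modified procedure is derived in the paper). In R:

modified_combination_test <- function(p1, p2,
                                      alpha0  = 0.5,
                                      alpha1  = 0.1793,
                                      c_alpha = exp(-0.5 * qchisq(0.95, df = 4))) {
  # reject if both p-values are small, or if both are moderate and their
  # Fisher product is small enough
  reject <- (max(p1, p2) <= alpha1) ||
            (max(p1, p2) <= alpha0 && p1 * p2 <= c_alpha)
  list(reject = reject, product = p1 * p2)
}

modified_combination_test(p1 = 0.03, p2 = 0.12)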


11:40 - 12:20
THURSDAY 3RD DEC, Session 2, Boardroom: Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Food Futures National Research Flagship
3CSIRO Plant Industry

E-mail: EmmaHuang@csiro.au

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represent phenotypic and genotypic diversity from across a population The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map However marker map construction in complex experimental crosses is nontrivial

At the heart of map construction is the calculation of marker genotype probabilities In bi-parental crosses such as backcrosses and F2 designs these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values In MAGIC populations due to multiple founders and intermediate generations being unobserved unambiguously inferring the marker genotypes is often no longer possible Consequently current map building software cannot be used for map construction in MAGIC populations

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC population in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1, Benoit Auvray1, Peter Amer2, Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch Invermay Agricultural Centre, Mosgiel, New Zealand
2AbacusBio Limited, Dunedin, New Zealand

E-mail: kendodds@agresearch.co.nz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will be almost true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets, and appropriate phenotypes, for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler1

1Plant and Food Research

E-mail: RuthButler@plantandfood.co.nz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics, because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p-values are often interpreted in a classical analysis as giving 1 minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p-value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews, 2001). In this poster, Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1. Matthews (2001, J Stat Plan Inf 94, 43-58)


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1, Tony Swain2, Olena Kravchuk1 and Geoffry Fordyce2

1School of Land Crop and Food Sciences, University of Queensland, Qld, Australia
2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail: okravchuk@uq.edu.au

The strong seasonal cycle in North Queensland pasture nutritive value affects size and maturity of breeding cattle causing low reproduction rates A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance In four years plasma progesterone (P4) was measured over time as an indicator of maturity The treatment factors included weaner size nutrition level and Androstenedione (A4) vaccination (which may advance puberty)

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<57) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms and unequal variances for the repeated measures, with the heifer-by-day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al, Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The vaccine × weaner size × time interaction was only significant in 1992. The vaccine × time interaction was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccination, but the vaccine effect diminished as heifers aged. The nutrition × weaner size × time interaction was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study with a complicated, unbalanced, repeated measures design.


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1

1The New Zealand Institute for Plant and Food Research Limited

E-mail: patrickconnolly@plantandfood.co.nz

The number of insects found in any particular location can depend on a very large number of variables and many of those could interact with one another One method of showing such interactions is the regression tree However though it gives a good representation of the relationship between the variables such representations are not very stable Omitting a single data point can result in a substantially different picture being created

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to build the trees, which are then used to predict the other half of the data. By examining the predictive ability of the several thousand trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients that could be used in spreadsheet calculations.
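A minimal sketch (illustration only; 'cicada' is a hypothetical data frame of counts and orchard/landscape predictors, and the settings are indicative) of a boosted regression tree fit with gbm:

library(gbm)

set.seed(1)
fit <- gbm(count ~ ., data = cicada,
           distribution = "poisson",     # insect counts
           n.trees = 3000,
           interaction.depth = 3,        # allow interactions between predictors
           shrinkage = 0.01,
           bag.fraction = 0.5,
           train.fraction = 0.5)         # hold out half the data for prediction

best_iter <- gbm.perf(fit, method = "test")    # number of trees minimising test error
summary(fit, n.trees = best_iter)              # relative influence of each variable
pred <- predict(fit, newdata = cicada, n.trees = best_iter, type = "response")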


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1, Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail: johnkoolaard@agresearch.co.nz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep, and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land Crop and Food Sciences, University of Queensland
2Centre for Nutrition and Food Sciences, University of Queensland

E-mail: okravchuk@uq.edu.au

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling in an experiment investigating the digestibility of sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions changed with the milling energy in an evident but complex way, and the average volumetric diameter alone was not an adequate summary of the distributions. It was thus necessary to construct a tailored algorithm for summarising the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1, Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne
2Remote sensing team, CSIRO Sustainable Ecosystems
3Senior Lecturer, The University of Melbourne

E-mail: davidlaz@gmail.com

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
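A minimal sketch (illustration only, simulated data; the poster does not name a package) of ridge and LASSO fits with glmnet, one way of implementing the shrinkage methods compared:

library(glmnet)

set.seed(1)
n <- 100; p <- 50
x <- matrix(rnorm(n * p), n, p)              # e.g. MODIS-derived change metrics
x[, 2] <- x[, 1] + rnorm(n, sd = 0.05)       # strong collinearity
y <- 2 * x[, 1] - 1.5 * x[, 3] + rnorm(n)    # toy mortality-severity response

ridge <- cv.glmnet(x, y, alpha = 0)          # alpha = 0: ridge penalty
lasso <- cv.glmnet(x, y, alpha = 1)          # alpha = 1: LASSO penalty

c(ridge = ridge$lambda.min, lasso = lasso$lambda.min)   # cross-validated penalties
pred <- predict(lasso, newx = x, s = "lambda.min")
coef(lasso, s = "lambda.min")                # sparse coefficient vector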

94 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia
2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens, if not hundreds, of thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and explore the extent to which this might be a problem in applications. In particular we compare the analysis of log-transformed data to full compositional data analysis.
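For readers unfamiliar with compositional data analysis, one standard first step is a log-ratio transform. The sketch below applies a centred log-ratio (clr) transform to hypothetical count data; it illustrates the general idea only, not the comparison performed by the authors.

```python
# Centred log-ratio (clr) transform of sum-constrained count data (illustrative).
# A small pseudo-count is added so that zero counts can be handled.
import numpy as np

counts = np.array([[120,  30,  50, 800],
                   [ 90,  10,  60, 840],
                   [200,  25,  40, 735]], dtype=float)   # hypothetical samples x components

comp = (counts + 0.5) / (counts + 0.5).sum(axis=1, keepdims=True)  # compositions on the simplex
clr = np.log(comp) - np.log(comp).mean(axis=1, keepdims=True)      # centred log-ratio coordinates
print(clr.round(3))
```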

95The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual validation tool, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative, we employed a minimum-squares-type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data for all but one of the 83 species.
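A minimal sketch of the two estimation ideas mentioned (maximum likelihood versus a least-squares fit on the Probability-Probability plot), using simulated gamma-distributed weights rather than the NPF trawl data; the implementation details are assumptions, not the authors' code.

```python
# Compare maximum likelihood with a "minimum squares on the P-P plot" estimate
# for a gamma distribution, on simulated weights.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(2)
w = np.sort(rng.gamma(shape=0.8, scale=50.0, size=200))   # hypothetical weights
emp = (np.arange(1, len(w) + 1) - 0.5) / len(w)           # empirical probabilities

def pp_ss(params):
    shape, scale = np.exp(params)                          # keep parameters positive
    return np.sum((stats.gamma.cdf(w, shape, scale=scale) - emp) ** 2)

res = optimize.minimize(pp_ss, x0=np.log([1.0, np.mean(w)]))
shape_pp, scale_pp = np.exp(res.x)
shape_ml, _, scale_ml = stats.gamma.fit(w, floc=0)
print("P-P least squares (shape, scale):", round(shape_pp, 3), round(scale_pp, 3))
print("Maximum likelihood (shape, scale):", round(shape_ml, 3), round(scale_ml, 3))
```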

96 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1
1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Diabetes complications such as kidney disease cause patients much pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data and information on this tendency in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that utilized the files of patients with type 2 diabetes who attended the Isfahan Metabolic and Endocrine Research Center from 1997 to 2007. This information was collected longitudinally. We used linear mixed-effects models to analyse the data. Results: Results of the linear mixed-effects model showed that there is a significant association between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information that this study provides can be used to identify groups at high risk of renal dysfunction.

Key Words: Longitudinal study, Mixed effect models, Creatinine, Type 2 diabetes

97The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS UNDERGOING ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics, Department of
2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Most studies of postoperative complications for isolated CABG surgery come from a single population and may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgery for an Australian population because no model has been developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14,533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation (60%) set and a model validation (40%) set. The data in the creation set were used to develop the model, and the validation set was then used to validate it. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC curve and the Hosmer-Lemeshow p-value, respectively. Results: Among the 14,533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3). The two postoperative complications were new renal failure (3.65%) and stroke (1.38%). The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L p < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L p < 0.001). Conclusion: We have identified risk factors for two major postoperative complications of CABG surgery.
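A minimal sketch of the general model-development workflow described (a 60/40 creation/validation split, logistic regression, and ROC-based discrimination), using simulated data; the predictors, sample size and coefficients below are hypothetical stand-ins, not the registry variables or the authors' final models.

```python
# Creation/validation split, logistic regression fit, and validation ROC area,
# illustrated on simulated data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 6))                      # hypothetical preoperative variables
lin = -3.0 + X @ np.array([0.8, 0.5, 0.0, 0.4, 0.0, 0.3])
y = rng.binomial(1, 1 / (1 + np.exp(-lin)))      # simulated complication indicator

X_dev, X_val, y_dev, y_val = train_test_split(X, y, train_size=0.6, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_dev, y_dev)
auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
print("validation ROC area:", round(auc, 3))
```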

98 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2
1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics, Informatics and Statistics
2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis where physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data. Instead, they are usually point predictions from spatial models fitted to auxiliary data sources. It is not clear what kind of effect the modelled covariates will have on the model, although simple approximations for simple models do give indications. We have performed some simulation studies to investigate the manner of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.
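The authors' simulations are based on real Great Barrier Reef data; the toy simulation below only illustrates the generic phenomenon at issue, namely that regressing on a modelled (error-contaminated) covariate rather than the true one can distort the fitted relationship. All quantities here are hypothetical.

```python
# Regress a response on the true covariate and on a noisy modelled version of it,
# and compare the fitted slopes (the latter is typically attenuated towards zero).
import numpy as np

rng = np.random.default_rng(4)
n, slope = 500, 1.0
x_true = rng.normal(size=n)                       # true physical variable
x_pred = x_true + rng.normal(scale=0.7, size=n)   # spatially modelled prediction with error
y = slope * x_true + rng.normal(scale=0.5, size=n)

b_true = np.polyfit(x_true, y, 1)[0]
b_pred = np.polyfit(x_pred, y, 1)[0]
print("slope with true covariate:", round(b_true, 3))
print("slope with modelled covariate:", round(b_pred, 3))
```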

99The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno
2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this article is to identify the dependency structure of the gene variants which influence septic states in paediatric patients.

The data set contains data on 580 paediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1, 2, 3]. To identify the role of different combinations of gene variants, and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. This made it possible to create a five-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patient group. The typical combinations of gene variants for the healthy group and for the septic patient group were then found. The results correspond well to those published in [1, 2, 3] for individual genes and enable recognition of the typical combination of variants of the six genes on which attention should be focused.

References
[1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal/permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol. 33, pp. 2158-2164.
[2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. Cesko-Slovenska Pediatrie 59, pp. 632-636, 2004.
[3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukin-6 gene variants and the risk of sepsis development in children. Human Immunology, Elsevier Science Inc., ISSN 0198-8859, 2007, vol. 68, pp. 756-760.

100 The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD-SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd
220 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application, with a tractor-mounted boom sprayer, of Bacillus thuringiensis Berliner (Bt) was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold, to >30% of moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion of the moths remained was also estimated, viz for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
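Given an estimate of b, the displayed equation can be solved numerically for the containment distance c. The following sketch does this with a standard root-finder; the value of b used is hypothetical, chosen only to show the calculation.

```python
# Solve exp(-bc)(1+bc) - (1-p) = 0 for c, the distance within which a
# proportion p of the moths remain. The decay parameter b below is hypothetical.
import numpy as np
from scipy.optimize import brentq

def containment_distance(b, p):
    f = lambda c: np.exp(-b * c) * (1 + b * c) - (1 - p)
    return brentq(f, 1e-9, 1e6)    # f > 0 near 0 and < 0 for large c, so a root is bracketed

b = 0.015                          # per metre, illustrative value only
print(containment_distance(b, p=0.90))   # distance containing 90% of the population
```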

101The International Biometric Society Australasian Region Conference

Poster Presentation Abstracts

IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury
2Imperial College London
3University of Otago


E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al. 2008) we have shown that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study, all oxygen saturation measurements across different behavioural states were combined together. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M Lee DS Russell G et al (2008 Australian Statistical Conference 2008)
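A rough sketch of the clustering step, on simulated infants rather than the study data: each infant's segments are summarised by the coefficient of variation, each infant is represented by quantiles of those values (a crude stand-in for the empirical distribution function), and Ward hierarchical clustering splits the infants into two groups. The choices below are illustrative assumptions, not the authors' procedure.

```python
# Two-group hierarchical clustering of per-infant coefficient-of-variation profiles,
# on simulated oxygen-saturation summaries.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(5)
n_babies, n_segments = 17, 40
means = rng.uniform(90, 97, size=n_babies)
sds = rng.uniform(0.5, 4.0, size=n_babies)
cv = np.array([s / m + rng.normal(0, 0.005, n_segments)
               for m, s in zip(means, sds)])             # per-segment CVs for each infant

profiles = np.quantile(cv, [0.1, 0.25, 0.5, 0.75, 0.9], axis=1).T
groups = fcluster(linkage(profiles, method="ward"), t=2, criterion="maxclust")
print(groups)    # 1 = one cluster ("stable"), 2 = the other ("unstable")
```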

102 The International Biometric Society Australasian Region Conference

Index of Presenting Authors

Arnold R 33
Asghari M 26
Baird D 30
Balinas V 46
Barnes M 27
Basford KE 84
Beath K 51
Bland JM 49
Briggs J 76
Burridge C 28
Burzykowski T 73
Butler R (poster) 89
Campain A 56
Chang K 70
Chang Y 85
Chee C 77
Clark ASS 36
Clarke S 74
Clifford D 72
Connolly P (poster) 91
Cui J 55
D'Antuono M 67
Darnell R (1) 35
Darnell R (2) 47
Davy M 40
Day S 43
Ding P 69
Dobbie M 48
Dodds K 88
Fewster R 37
Forrester R 34
Ganesalingam S 79
Ganesh S 78
Gatpatan JMC 48
Graham M 33
Graham P 65
Huang E 87
Hwang J 57
Ihaka R 36
Jones G 45
Kifley A 53
Kipnis V 61
Koolaard J (poster) 92
Kravchuk O (poster 1) 90
Kravchuk O (poster 2) 92
Lazaridis D 93
Le Cao K 81
Littlejohn R 67
Liu I 66
Liu J 82
Lumley T 59
Marschner I 52
Matthews D 58
McLachlan A 44
Meyer D 68
Meyer R 75
Mohebbi M 38
Mueller S 47
Muller W (poster) 94
Naka M (poster) 95
Neeman T 82
Neuhäuser M 86
Orellana L 54
Park E 64
Park Z 42
Pirie M 32
Poppe K 71
Rousta S (poster) 96
Ruggiero K 71
Ryan L 25
Sanagou M (poster) 97
Saville D 83
Scott A 60
Shimadzu H 62
Shimadzu H (poster) 98
Sibanda N 76
Smerek M (poster) 99
Smith AB 69
Stewart M 77
Stojanovski E 31
Taylor J 41
Thijs H 50
Triggs CM 80
Wallace AR (poster) 100
Wang Y 29
Welsh A 51
Williams E 70
Yee T 62
Yelland L 63
Yoon H 39
Zahari M (poster) 101

107The International Biometric Society Australasian Region Conference

DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau

108 The International Biometric Society Australasian Region Conference

Name E-mail

Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde

Delegates List

109The International Biometric Society Australasian Region Conference

Delegates List

Name Email

Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz


6 The International Biometric Society Australasian Region Conference

INVITED SPEAKERS

Ross Ihaka, University of Auckland

Ross Ihaka is Associate Professor of Statistics at the University of Auckland. He is recognized as one of the originators of the R programming language. In 2008 he received the Royal Society of New Zealand's Pickering Medal for his work on R.

Kaye Basford University of Queensland

Kaye Basford is Head of the School of Land, Crop and Food Sciences at the University of Queensland. Her work in biometrics focuses on the analysis and interpretation of data from large-scale multi-environment plant breeding experiments, in particular using a pattern analysis approach. Kaye is currently IBS Vice-President, in advance of her Presidential term 2010-11.

Alison Smith NSW Department of Industry and Investment

Alison Smith is at the Wagga Wagga Agricultural Institute in the NSW Department of Industry and Investment (formerly Primary Industries) Biometrics Unit, where she works on and researches methodology for plant breeding multi-environment variety trials, plant quality trait experiments, micro-array data, and outlier detection in linear mixed models.

7The International Biometric Society Australasian Region Conference

GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events.

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions.

Conference Catering
Lunches, Morning and Afternoon Teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16).

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk.

Transport has been arranged in coaches to transfer delegates to dinner from the Suncourt Hotel & Conference Centre, leaving at 6 pm, with return trips at the conclusion of the event.

Welcome reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer.

8 The International Biometric Society Australasian Region Conference

VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel
Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (0.2 km)
Turn left at Titiraupenga Street (31 m)
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street then onto Tongariro Street)
Continue onto Tongariro Street (1.1 km - go through one roundabout)
Continue onto SH 1/SH 5 (1.0 km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past the Honey Hive to the end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo, (07) 378 8265, www.suncourt.co.nz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo

9The International Biometric Society Australasian Region Conference

ORGANISED SOCIAL ACTIVITIES
Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region.

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member, non-member or student) attending the whole week. Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events.

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6 pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim then a meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo, Tel (07) 378-7080).

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event.

Other Organised Social Activities - Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors, smell the coffee brewing as you board the Waikare II, take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings. A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina. Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park. The sights are amazing all year round. Afternoon tea is included as part of your charter, and tea or coffee are complimentary throughout the cruise. There are also full bar facilities.

Fishing for, and hopefully eating, rainbow or brown trout is included in the charter, although to meet licence requirements only four clients can be nominated to actually land the catch. Only 4 lines can be put out at a time on downriggers. If successful, any catch can be barbequed or sashimied and served and shared onboard - there

10 The International Biometric Society Australasian Region Conference

Organised Social Activities

is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation, where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand, trout as a game fish cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting. The cost is $180 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front
Take: Swimwear, including a towel, if you want an invigorating deep water swim off the launch. Don't forget to take your camera as some of the scenery can only be seen from on the water
Cost: $70 per person based on a three hour scenic charter including fishing, with clay bird shooting extra at $180 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River, and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.

11The International Biometric Society Australasian Region Conference

Organised Social Activities

Leaving the gushing sounds of the mesmerizing Falls, you cut through leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river, and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back.

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (a waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating, geothermal and nature - Orakei Karako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Karako, possibly the best thermal area in New Zealand.

12 The International Biometric Society Australasian Region Conference

Organised Social Activities

In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's cave and some of the largest silica terraces in the world.

While the park visitors are on land, option 2 ($140) whisks swimmers away to the Squeeze. You will disembark the boat in knee-deep warm water. After manoeuvring your way through narrow crevasses, climbing boulders and wading through waist-deep warm water, you emerge in stunning native New Zealand bush. Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool.

Then the groups rejoin for the thrilling return trip, giving a total trip time of about three hours. This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience.

Time: Transport departs Suncourt Hotel at 1.30 pm, returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1 including park admission; $140 pp for option 2; both options including transport
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used

13The International Biometric Society Australasian Region Conference

Organised Social Activities

4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly; terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: A cafe snack is not included but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22

14 The International Biometric Society Australasian Region Conference

SPONSORS

The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland

15The International Biometric Society Australasian Region Conference

Sponsors

AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax

16 The International Biometric Society Australasian Region Conference

VENUE FLOOR PLAN

1 Boardroom: For all Board session presentations
2 Swifts: For keynote addresses, invited speaker talks and all Swifts sessions
3 Bathrooms/Toilets
4 'Chill on Northcroft' Restaurant: All morning/afternoon teas and lunches will be provided here
5 Gullivers: Computer room with two internet access desktops
6 Lems: Registration desk location, and further desk space and power points for wireless internet access

17The International Biometric Society Australasian Region Conference

CONFERENCE TIMETABLE

SUNDAY 29TH NOV
1600 Conference Registration opens
1800 Welcome Reception
Dinner (own arrangement)

MONDAY 30TH NOV
850 Presidential Opening (Swifts)
Graham Hepworth, University of Melbourne

900 Keynote Address (Swifts)
Louise Ryan, CSIRO Mathematics, Informatics and Statistics
Quantifying uncertainty in risk assessment
Chair: Graham Hepworth

950-1030
Session 1 Swifts: Medical - Chair: John Field
Session 1 Boardroom: Ecological Modelling - Chair: Teresa Neeman

950
Swifts: Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approach - Mohamad Asghari, Tarbiat Modares University
Boardroom: Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splines - Charis Burridge, CSIRO Mathematics, Informatics and Statistics

1010
Swifts: Personalised medicine: endovascular aneurysm repair risk assessment model using preoperative variables - Mary Barnes, CSIRO Mathematics, Informatics and Statistics
Boardroom: Rank regression for analyzing environmental data - You-Gan Wang, CSIRO Mathematics, Informatics and Statistics

1030 Morning Tea (30 minutes)

1100-1220
Session 2 Swifts: Modelling - Chair: Andrew McLachlan
Session 2 Boardroom: Environmental & Methods - Chair: Zaneta Park

1100
Swifts: Introduction to Quantile regression - David Baird, VSN NZ Ltd
Boardroom: Capture recapture estimation using finite mixtures of arbitrary dimension - Richard Arnold, Victoria University

18 The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV
1120
Swifts: Incorporating study characteristics in the modelling of associations across studies - Elizabeth Stojanovski, University of Newcastle
Boardroom: The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabies - Robert Forrester, ANU

1140
Swifts: A comparison of matrices of time series with application in dendroclimatology - Maryanne Pirie, University of Auckland
Boardroom: Model based grouping of species across environmental gradients - Ross Darnell, CSIRO Mathematics, Informatics and Statistics

1200
Swifts: How SAS and R integrate - Michael Graham, SAS Auckland
Boardroom: The use of the chi-square test when observations are dependent - Austina Clark, University of Otago

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts)
Ross Ihaka, University of Auckland
Writing Efficient Programs in R and Beyond
Chair: Renate Meyer

1410-1510
Session 3 Swifts: Variance - Chair: Geoff Jones
Session 3 Boardroom: Genetics - Chair: John Koolaard

1410
Swifts: Variance estimation for systematic designs in spatial surveys - Rachel Fewster, University of Auckland
Boardroom: Developing modules in genepattern for gene expression analysis - Marcus Davy, Plant and Food Research

1430
Swifts: Variance components analysis for balanced and unbalanced data in reliability of gait measurement - Mohammadreza Mohebbi, Monash University
Boardroom: High dimensional QTL analysis within complex linear mixed models - Julian Taylor, CSIRO Mathematics, Informatics and Statistics

1450
Swifts: Modernizing AMOVA using ANOVA - Hwan-Jin Yoon, ANU
Boardroom: Correlation of transcriptomic and phenotypic data in dairy cows - Zaneta Park, AgResearch

1510 Afternoon Tea (30 minutes)

19The International Biometric Society Australasian Region Conference

Conference Timetable

MONDAY 30TH NOV
1540-1700
Session 4 Swifts: Modelling - Chair: Mario D'Antuono
Session 4 Boardroom: Ecology - Chair: Rachel Fewster

1540
Swifts: Non-inferiority margins in clinical trials - Simon Day, Roche Products Ltd
Boardroom: Visualising model selection criteria for presence and absence data in ecology - Samuel Mueller, University of Sydney

1600
Swifts: Data processing using Excel with R - Andrew McLachlan, Plant and Food Research Lincoln
Boardroom: Estimating weights for constructing composite environmental indices - Ross Darnell, CSIRO Mathematics, Informatics and Statistics

1620
Swifts: Investigating covariate effects on BDD infection with longitudinal data - Geoffrey Jones, Massey University
Boardroom: A spatial design for monitoring the health of a large-scale freshwater river system - Melissa Dobbie, CSIRO Mathematics, Informatics and Statistics

1640
Swifts: Statistical modelling of intrauterine growth for Filipinos - Vincente Balinas, University of the Philippines Visayas
Boardroom: Backfitting estimation of a response surface model - Jhoanne Marsh C Gatpatan, University of the Philippines Visayas

1700 Poster Session - Chair: Melissa Dobbie

1800 Dinner (own arrangement)

20 The International Biometric Society Australasian Region Conference

Conference Timetable

TUESDAY 1ST DEC
900 Keynote Address (Swifts)
Martin Bland, University of York
Clustering by treatment provider in randomised trials
Chair: Simon Day

950-1030
Session 1 Swifts: Missing Data - Chair: Vanessa Cave
Session 1 Boardroom: Count Data - Chair: Hwan-Jin Yoon

950
Swifts: The future of missing data - Herbert Thijs, Hasselt University
Boardroom: A strategy for modelling count data which may have extra zeros - Alan Welsh, ANU

1010
Swifts: Application of latent class with random effects models to longitudinal data - Ken Beath, Macquarie University
Boardroom: A reliable constrained method for identity link Poisson regression - Ian Marschner, Macquarie University

1030 Morning Tea / IBS Biennial General Meeting (60 minutes)

1130-1230
Session 2 Swifts: Medical - Chair: Hans Hockey
Session 2 Boardroom: Modelling - Chair: Olena Kravchuk

1130
Swifts: Multivariate response models for global health-related quality of life - Annette Kifley, Macquarie University
Boardroom: Building a more stable predictive logistic regression model - Anna Campain, University of Sydney

1150
Swifts: Estimation of optimal dynamic treatment regimes from longitudinal observational data - Liliana Orellana, Universidad de Buenos Aires
Boardroom: Stepwise paring down variation for identifying influential multifactor interactions - Jing-Shiang Hwang, Academia Sinica

1210
Swifts: Parametric conditional frailty models for recurrent cardiovascular events in the lipid study - Jisheng Cui, Deakin University
Boardroom: Empirical likelihood estimation of a diagnostic test likelihood ratio - David Matthews, University of Waterloo

1230 Lunch (1 hour)
1330 Organised Social Activities
1800 Dinner (own arrangement)

21The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC
900 Keynote Address (Swifts)
Thomas Lumley, University of Washington
Using the whole cohort in analysis of subsampled data
Chair: Alan Welsh

950-1030
Session 1 Swifts: Clinical Trials - Chair: Ian Marschner
Session 1 Boardroom: Fisheries - Chair: Charis Burridge

950
Swifts: Adjusting for nonresponse in case-control studies - Alastair Scott, University of Auckland
Boardroom: An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimation - Hideyasu Shimadzu, GeoScience Australia

1010
Swifts: Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associations - Victor Kipnis, USA National Cancer Institute
Boardroom: On the 2008 World Fly Fishing Championships - Thomas Yee, University of Auckland

1030 Morning Tea (30 minutes)

1100-1220
Session 2 Swifts: Medical Models - Chair: Katrina Poppe
Session 2 Boardroom: Agriculture/Horticulture - Chair: Emlyn Williams

1100
Swifts: Relative risk estimation in randomised controlled trials: a comparison of methods for independent observations - Lisa Yelland, University of Adelaide
Boardroom: Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactation - Roger Littlejohn, AgResearch

1120
Swifts: Multiple stage procedures in covariate-adjusted response-adaptive designs - Eunsik Park, Chonnam National University
Boardroom: Some statistical approaches in estimating lambing rates - Mario D'Antuono, Dept of Agriculture WA

22 The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC
1140
Swifts: Potential outcomes and propensity score methods for hospital performance comparisons - Patrick Graham, University of Otago
Boardroom: FTIR analysis associations with induction and release of kiwifruit buds from dormancy - Denny Meyer, Swinburne University of Technology

1200
Swifts: Local odds ratio estimation for multiple response contingency tables - Ivy Liu, Victoria University
Boardroom: Non-linear mixed-effects modelling for a soil temperature study - Pauline Ding, ANU

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts)
Alison Smith, NSW Department of Industry and Investment
Embedded partially replicated designs for grain quality testing
Chair: David Baird

1410-1510
Session 3 Swifts: Design - Chair: Ross Darnell
Session 3 Boardroom: Functional Analysis - Chair: Marcus Davy

1410
Swifts: Spatial models for plant breeding trials - Emlyn Williams, ANU
Boardroom: Can functional data analysis be used to develop a new measure of global cardiac function? - Katrina Poppe, University of Auckland

1430
Swifts: A two-phase design for a high-throughput proteomics experiment - Kevin Chang, University of Auckland
Boardroom: Variable penalty dynamic warping for aligning GC-MS data - David Clifford, CSIRO

1450
Swifts: Shrinking sea-urchins in a high CO2 world: a two-phase experimental design - Kathy Ruggiero, University of Auckland
Boardroom: A model for the enzymatically 18O-labeled MALDI-TOF mass spectra - Tomasz Burzykowski, Hasselt University

1510 Afternoon Tea (30 minutes)

23The International Biometric Society Australasian Region Conference

Conference Timetable

WEDNESDAY 2ND DEC
1540-1700
Session 4 Swifts: Methods - Chair: David Clifford
Session 4 Boardroom: Mixtures & Classification - Chair: Thomas Yee

1540
Swifts: High-dimensional multiple hypothesis testing with dependence - Sandy Clarke, University of Melbourne
Boardroom: On estimation of nonsingular normal mixture densities - Michael Stewart, University of Sydney

1600
Swifts: Metropolis-Hastings algorithms with adaptive proposals - Renate Meyer, University of Auckland
Boardroom: Estimation of finite mixtures with nonparametric components - Chew-Seng Chee, University of Auckland

1620
Swifts: Bayesian inference for multinomial probabilities with non-unique cell classification and sparse data - Nokuthaba Sibanda, Victoria University
Boardroom: Classification techniques for class imbalance data - Siva Ganesh, Massey University

1640
Swifts: Filtering in high dimension dynamic systems using copulas - Jonathon Briggs, University of Auckland
Boardroom: Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the mean - Selvanayagam Ganesalingam, Massey University

1800 Conference Dinner

24 The International Biometric Society Australasian Region Conference

Conference Timetable

THURSDAY 3RD DEC
900 Keynote Address (Swifts)
Chris Triggs, University of Auckland
Nutrigenomics - a source of new statistical challenges
Chair: Ruth Butler

950-1030
Session 1 Swifts: Genetics - Chair: Ken Dodds
Session 1 Boardroom: Ecology - Chair: Duncan Hedderley

950
Swifts: Combination of clinical and genetic markers to improve cancer prognosis - Kim-Anh Le Cao, University of Queensland
Boardroom: A multivariate feast among bandicoots at Heirisson Prong - Teresa Neeman, ANU

1010
Swifts: Effective population size estimation using linkage disequilibrium and diffusion approximation - Jing Liu, University of Auckland
Boardroom: Environmental impact assessments: a statistical encounter - Dave Saville, Saville Statistical Consulting Ltd

1030 Morning Tea (30 minutes)

1100 Invited Speaker (Swifts)
Kaye Basford, University of Queensland
Ordination of marker-trait association profiles from long-term international wheat trials
Chair: Lyn Hunt

1140-1220
Session 2 Swifts: Medical - Chair: Ken Beath
Session 2 Boardroom: Genetics - Chair: Julian Taylor

1140
Swifts: Finding best linear combination of markers for a medical diagnostic with restricted false positive rate - Yuan-chin Chang, Academia Sinica
Boardroom: Believing in magic: validation of a novel experimental breeding design - Emma Huang, CSIRO Mathematics, Informatics and Statistics

1200
Swifts: A modified combination test for the analysis of clinical trials - Markus Neuhäuser, Rhein Ahr Campus
Boardroom: Phenotypes for training and validation of whole genome selection methods - Ken Dodds, AgResearch

1220 Closing Remarks
1230 Lunch
1300 Conference Concludes

25The International Biometric Society Australasian Region Conference

ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts)
Louise Ryan, CSIRO Mathematics, Informatics and Statistics
Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise RyanCSIRO Mathematics Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. in the insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.

26 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

950 - 1030

MONDAY 30TH NOV
Session 1 Swifts: Medical
Chair: John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal Cancer (CRC) is one of the most malignant cancers throughout the world, and it varies because of the different effects of risk factors in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluation of the risk factors of the cancer as a whole would not provide a thorough understanding of the cancer. Therefore the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis according to the pathology report of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007 were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis utilizing Stata statistical software. The results confirm gender, alcohol history, IBD and tumour grade as specific risk factors of colon cancer, and hypertension, opium and personal history as specific risk factors of rectum cancer. Also, BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity, and colon and rectum cancers should be evaluated specifically to reveal hidden associations which may not be revealed under general modelling. These findings could provide more information for prognosis and treatment therapy and possible application of screening programs specifically for colon and rectum carcinomas.

27The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia, Mathematics, Informatics and Statistics, Glen Osmond, South Australia
2Department of Surgery, University of Adelaide, The Queen Elizabeth Hospital, Adelaide, South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001 and whose outcomes were followed for more than five years.

The ERA Model is available at the following website: www.health.adelaide.edu.au/surgery/evar. The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA Model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using data from a specialist UK vascular institute. Despite UK patients being sicker (p<0.001), having larger aneurysms (p<0.001) and being more likely to die (p<0.05) than the Australian patients, the ERA model fitted the UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher areas under ROC curves and/or higher R2.

The ERA Model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008) Eur J Vasc Endovasc Surg 35:571-579
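Bootstrap internal validation of a logistic risk model of this kind is often summarised as an optimism-corrected discrimination estimate. The sketch below illustrates that generic idea on simulated data only; it is not the ERA model, its variables, or the validation actually performed.

```python
# Optimism-corrected ROC area for a logistic risk model via bootstrap resampling,
# on simulated data (all quantities hypothetical).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 1000
X = rng.normal(size=(n, 8))
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.5 + X[:, 0] + 0.6 * X[:, 1]))))

fit_all = LogisticRegression(max_iter=1000).fit(X, y)
apparent = roc_auc_score(y, fit_all.predict_proba(X)[:, 1])

optimism = []
for _ in range(200):
    idx = rng.integers(0, n, n)                                   # bootstrap resample
    m = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    auc_boot = roc_auc_score(y[idx], m.predict_proba(X[idx])[:, 1])
    auc_orig = roc_auc_score(y, m.predict_proba(X)[:, 1])
    optimism.append(auc_boot - auc_orig)

print("apparent AUC:", round(apparent, 3))
print("optimism-corrected AUC:", round(apparent - np.mean(optimism), 3))
```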

28 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

950 - 1030

MONDAY 30TH NOV
Session 1 Boardroom: Ecological Modelling
Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that an annual multi-species fishery-independent survey be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region as well as the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.

29The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics, Informatics and Statistics, Australia
2School of Mathematics and Statistics, Northeast Normal University, China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust, and has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions. To alleviate the computational burden we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.

30 The International Biometric Society Australasian Region Conference

Oral Presentation Abstracts

1100 -1220

MONDAY 30TH NOV
Session 2 Swifts: Modelling
Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird1
1VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize Σ e(Q - I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS, University of Newcastle; 2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed distributed around an overall ratio. In the second model there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.
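For orientation only, here is a frequentist random-effects meta-analysis of log risk ratios using the metafor package in R; it is a rough analogue of the pooling step, not the authors' Bayesian hierarchical model in WinBUGS, and the six study results below are made up for illustration.

library(metafor)

yi <- log(c(2.4, 1.8, 2.9, 1.6, 2.2, 2.0))  # hypothetical log risk ratios
vi <- c(0.10, 0.08, 0.15, 0.12, 0.09, 0.11) # hypothetical sampling variances

res <- rma(yi = yi, vi = vi, method = "REML")  # random-effects model
res
exp(c(res$b, res$ci.lb, res$ci.ub))            # pooled RR and 95% CI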


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie1
1Department of Statistics and School of Geography, Geology and Environment, University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activity of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are a result of a possible failure of the uniformitarianism principle: the responses of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset containing time series of ring width indices for each core was divided into two subsets:

1. The portion of the series produced when the trees were small, and

2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham1
1Analytics, SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers a way to experiment with new cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together and plans for future integration

11:00 - 12:20

MONDAY 30TH NOV, Session 2 (Boardroom): Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ; 2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit, ANU; 2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals but as yet has not been tested in marsupials. Thirty-five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaConTM (Vac1), or a single vaccination of GonaConTM followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions at irregular intervals over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or the boosted vaccination.

The data are analysed using repeated measures methods to assess the long term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.
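One option that handles unequally spaced measurement times is a linear mixed model with a continuous-time AR(1) correlation structure, sketched below with the nlme package in R; the simulated data and variable names are hypothetical and this is only one of the approaches that could be compared.

library(nlme)

set.seed(5)
growth <- expand.grid(animal = factor(1:35), occasion = 1:18)
weeks_grid <- sort(sample(2:115, 18))                    # 18 irregular occasions
growth$weeks <- weeks_grid[growth$occasion]
trt_by_animal <- sample(rep(c("Control", "Vac1", "Vac2"), length.out = 35))
growth$treatment <- factor(trt_by_animal[growth$animal]) # constant within animal
animal_eff <- rnorm(35, 0, 1.5)
growth$size <- 10 + 0.08 * growth$weeks + animal_eff[growth$animal] +
               rnorm(nrow(growth), sd = 0.5)

fit <- lme(size ~ weeks * treatment,
           random = ~ 1 | animal,
           correlation = corCAR1(form = ~ weeks | animal),  # continuous-time AR(1)
           data = growth)
anova(fit)   # tests of treatment, time and their interaction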


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics, Informatics and Statistics; 2CSIRO Wealth from Ocean Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss. We term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients, from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.
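A toy sketch of the finite-mixture-of-GLMs idea using the flexmix package in R, with mixing grouped by species so that each species is assigned to one "archetype"; the presence/absence data are simulated with two archetypes and this is not the authors' own implementation or the GBR data.

library(flexmix)

set.seed(6)
n_sites <- 150; n_spp <- 30
temp <- runif(n_sites, 20, 30)
arch <- sample(1:2, n_spp, replace = TRUE)      # true (unknown) archetype per species
dat <- do.call(rbind, lapply(1:n_spp, function(s) {
  eta <- if (arch[s] == 1) -10 + 0.45 * temp else 8 - 0.35 * temp
  data.frame(species = factor(s), temp = temp,
             pres = rbinom(n_sites, 1, plogis(eta)))
}))

# Binomial GLM mixture; "| species" groups all records of a species together
m <- flexmix(cbind(pres, 1 - pres) ~ temp | species, k = 2, data = dat,
             model = FLXMRglm(family = "binomial"))
parameters(m)   # archetype-specific intercepts and temperature slopes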


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S. S. Clark1
1University of Otago

E-mail aclarkmathsotagoacnz

When the Chi-square test is applied to test the association between two multinomial distributions, each with the same number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the Chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S. W. (1958, JEBS, 69-82) and Huynh, H. & Feldt, L. S. (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.

13:30 MONDAY 30TH NOV, Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ; 2University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.
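A tiny example of the kind of efficiency issue the talk is concerned with, not taken from the talk itself: growing an object inside a loop forces repeated copying, whereas a single vectorised call does the same work in one pass.

n <- 5e4
x <- runif(n)

slow <- function(x) {          # repeated copying as the vector grows
  out <- numeric(0)
  for (i in seq_along(x)) out <- c(out, x[i]^2)
  out
}

fast <- function(x) x^2        # vectorised arithmetic in one call

system.time(slow(x))
system.time(fast(x))
identical(slow(x[1:10]), fast(x[1:10]))   # same result, very different cost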


14:10 - 15:10

MONDAY 30TH NOV, Session 3 (Swifts): Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1
1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, to approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modeling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1,2, Rory Wolfe1,2, Jennifer McGinley2, Pamela Simpson1,2, Pamela Murphy1,2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods was illustrated in examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.
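A minimal sketch of the kind of variance-components fit described in the Method section, using the lme4 package in R; the subjects, assessors and measurements below are simulated placeholders, not the gait study data.

library(lme4)

set.seed(8)
d <- expand.grid(subject = factor(1:20), assessor = factor(1:3), session = 1:2)
subj_eff <- rnorm(20, 0, 3)               # between-subject variation
asse_eff <- rnorm(3, 0, 1)                # between-assessor variation
d$y <- 60 + subj_eff[d$subject] + asse_eff[d$assessor] +
       rnorm(nrow(d), 0, 2)               # residual (within-session) error

vc <- lmer(y ~ 1 + (1 | subject) + (1 | assessor), data = d)
VarCorr(vc)   # estimated subject, assessor and residual variance components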


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To perform AMOVA, special genetics packages such as Arlequin and GenAlex are required.

Using fungus microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages, in which ANOVA and REML are standard methods, may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
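To make the point concrete, here is a sketch of the nested REML layout in a general package (lme4 in R): regions, populations within regions, and individuals within populations. The response and grouping below are simulated, not the fungus microsatellite data.

library(lme4)

set.seed(9)
pops <- expand.grid(indiv = 1:10, pop = factor(1:5), region = factor(1:3))
rp   <- interaction(pops$region, pops$pop)               # population within region
pops$y <- rnorm(nrow(pops), sd = 1) +                    # within-population variation
          rnorm(nlevels(rp), 0, 1)[rp] +                 # among populations within regions
          rnorm(3, 0, 1.5)[pops$region]                  # among regions

fit <- lmer(y ~ 1 + (1 | region) + (1 | region:pop), data = pops)
VarCorr(fit)   # the same three tiers that AMOVA partitions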


14:10 - 15:10

MONDAY 30TH NOV, Session 3 (Boardroom): Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research; 2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ-funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla1,2

1CMIS, CSIRO; 2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation such as spatial trends and extraneous environmental variation need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high dimensional genetic component becomes problematic. This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way even when the number of genetic variables exceeds the number of observations. This method is then applied to wheat quality traits and a well established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch; 2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, and associated phenotypic data (milk yield; protein, casein and total solids percentage and yield; and growth hormone, IGF and insulin levels). These data are highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data were analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure that only quality gene expression data were used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.


15:40 - 17:00

MONDAY 30TH NOV, Session 4 (Swifts): Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined. Some are highly statistical; some are based much more on clinical judgement; some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment, nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample these texture analysis and rheological methods generated many data points, which were plotted as curves. Summarising these curves usually involves finding points of interest such as peaks or troughs and points of maximum slope, which is often done subjectively by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly.
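A base-R sketch of the kind of curve summaries described (peak location and point of maximum slope); the curve here is simulated noise around a smooth bump, not a real texture trace, and the smoothing choice is only illustrative.

set.seed(10)
t <- seq(0, 10, by = 0.01)
y <- dnorm(t, mean = 4, sd = 1) * 50 + rnorm(length(t), sd = 0.2)
ys <- lowess(t, y, f = 0.05)$y           # light smoothing before locating features

peak_idx  <- which.max(ys)               # highest point of the curve
slope     <- diff(ys) / diff(t)          # first derivative by finite differences
slope_idx <- which.max(slope)            # steepest rising point

c(peak_time = t[peak_idx], peak_value = ys[peak_idx],
  max_slope_time = t[slope_idx], max_slope = slope[slope_idx])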


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ; 2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful; hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test applied to blood samples from individual animals is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines; 2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurement of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to those in previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth of different populations differs. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.
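A rough sketch of the centile-curve construction described (quadratic regression on gestational age, with 10th, 50th and 90th percentiles); the measurements are simulated under normal residuals, not the Filipino data, so the centile formula is only one simple possibility.

set.seed(11)
ga  <- runif(500, 20, 42)                                    # gestational age, weeks
bpd <- 2 + 0.35 * ga - 0.004 * ga^2 + rnorm(500, sd = 0.25)  # biparietal diameter, cm

fit   <- lm(bpd ~ ga + I(ga^2))                              # quadratic in gestational age
grid  <- data.frame(ga = 20:42)
mu    <- predict(fit, grid)
sigma <- summary(fit)$sigma

centiles <- data.frame(ga  = grid$ga,
                       p10 = mu + qnorm(0.10) * sigma,
                       p50 = mu,
                       p90 = mu + qnorm(0.90) * sigma)
matplot(centiles$ga, centiles[, -1], type = "l", lty = 1,
        xlab = "Gestational age (weeks)", ylab = "BPD (cm)")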

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length


15:40 - 17:00

MONDAY 30TH NOV, Session 4 (Boardroom): Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia; 2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled as a function of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood-based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.
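The raw material for such diagrams is simply a table of criterion values over candidate models; a minimal sketch in R follows, with simulated presence/absence data and invented predictor names rather than the rock-wallaby data.

set.seed(12)
n <- 200
dat <- data.frame(food = runif(n), cover = runif(n), slope = runif(n))
dat$scat <- rbinom(n, 1, plogis(-1 + 2 * dat$food + 1.5 * dat$cover))

forms <- list(scat ~ food, scat ~ cover, scat ~ food + cover,
              scat ~ food + cover + slope)
fits  <- lapply(forms, glm, family = binomial, data = dat)

data.frame(model = sapply(forms, function(f) deparse(f[[3]])),
           AIC   = sapply(fits, AIC),
           BIC   = sapply(fits, BIC))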

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics, Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Mathematics, Informatics and Statistics, Australia; 2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, handling the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278) that allocates sparse sampling resources across space to maximise the information available and to ensure that reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas; 2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.
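A bare-bones illustration of the backfitting idea for an additive model with two terms: alternately fit each term to the partial residuals of the other until the fits stabilise. The data are simulated; this is generic backfitting, not the authors' response-surface estimator.

set.seed(14)
x1 <- runif(200, -1, 1); x2 <- runif(200, -1, 1)
y  <- 2 + 3 * x1 + 4 * x2^2 + rnorm(200, sd = 0.3)

f1 <- rep(0, 200); f2 <- rep(0, 200); mu <- mean(y)
for (it in 1:20) {
  f1 <- fitted(lm((y - mu - f2) ~ x1))            # update term 1 on partial residuals
  f1 <- f1 - mean(f1)
  f2 <- fitted(lm((y - mu - f1) ~ poly(x2, 2)))   # update term 2 on partial residuals
  f2 <- f2 - mean(f2)
}
summary(lm(y ~ x1 + poly(x2, 2)))$sigma           # joint least-squares fit, for comparison
sd(y - (mu + f1 + f2))                            # residual spread after backfitting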

Keywords: backfitting, response surface model, second order model, central composite design


TUESDAY 1ST DEC

9:00 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J. Martin Bland1
1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators such as surgeons or therapists. These operators form a hidden sample whose effect is usually ignored. Recently trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.


9:50 - 10:30

TUESDAY 1ST DEC, Session 1 (Swifts): Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were more recently developed in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above-mentioned methods rests on the fact that tolerability of pain drugs often is an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision, it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as the other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.


APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

9:50 - 10:30

TUESDAY 1ST DEC, Session 1 (Boardroom): Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

We will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
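A hedged sketch of that kind of model sequence in R: a plain Poisson fit, a crude check for overdispersion, and a zero-inflated alternative via the pscl package. The data are simulated and the particular diagnostics and models are not necessarily those of the talk.

library(pscl)

set.seed(15)
n <- 300
x <- runif(n)
z <- rbinom(n, 1, 0.3)                               # structural (extra) zeros
counts <- ifelse(z == 1, 0, rnbinom(n, mu = exp(0.5 + 1.2 * x), size = 1.5))
d <- data.frame(counts, x)

m_pois <- glm(counts ~ x, family = poisson, data = d)
sum(residuals(m_pois, type = "pearson")^2) / df.residual(m_pois)  # >> 1 suggests overdispersion

m_zinb <- zeroinfl(counts ~ x | 1, dist = "negbin", data = d)     # extra zeros + overdispersion
summary(m_zinb)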


A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients to be non-negative rather than the fitted means. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.
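For context, here is the standard (and, as the abstract notes, sometimes unstable) way to attempt an identity-link Poisson fit in R; the EM-based method of the talk is not implemented here, and the data are simulated.

set.seed(16)
n  <- 500
x1 <- rbinom(n, 1, 0.4); x2 <- runif(n)
mu <- 1 + 0.8 * x1 + 1.5 * x2                 # additive (not multiplicative) rates
y  <- rpois(n, mu)

fit <- glm(y ~ x1 + x2, family = poisson(link = "identity"),
           start = c(mean(y), 0, 0))          # starting values often needed; convergence not guaranteed
coef(fit)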


11:30 - 12:30

TUESDAY 1ST DEC, Session 2 (Swifts): Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia; 2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.


ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1, Andrea Rotnitzky2,3 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina; 2Universidad T. di Tella, Buenos Aires, Argentina
3Harvard School of Public Health, Boston, USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV-infected subjects, a rule which only depends on the covariate history through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on the efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV positive patients to illustrate estimation of the optimal CD4 count level to start HAART


PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1, Andrew Forbes2, Adrienne Kirby3, Ian Marschner4, John Simes3, Malcolm West5 and Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk for those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.


11:30 - 12:30

TUESDAY 1ST DEC, Session 2 (Boardroom): Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities and is in that sense superior to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing parameter estimates after imputation similar to those found when the data were fully observed. It will be shown that the amount of missingness present in the data set and the nature of the variable in question affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.
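A generic sketch of multiple imputation followed by a pooled logistic regression using the mice package in R; the covariates and missingness pattern are invented, and this shows only the imputation-plus-pooling step, not the authors' full stabilisation procedure.

library(mice)

set.seed(17)
n <- 300
d <- data.frame(age = rnorm(n, 30, 5), hcg = rnorm(n, 50, 10))
d$viable <- rbinom(n, 1, plogis(-2 + 0.05 * d$hcg))
d$hcg[sample(n, 60)] <- NA                      # impose ~20% missingness in one covariate

imp  <- mice(d, m = 10, printFlag = FALSE)      # multiple imputation
fits <- with(imp, glm(viable ~ age + hcg, family = binomial))
summary(pool(fits))                             # Rubin's rules for pooled estimates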


STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists use techniques that produce high dimensional data to unveil hidden information. Although several model-based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of the responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.
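A bare-bones sketch of the stated idea in base R: repeatedly fit single-factor models, keep the factor explaining the most variation, subtract its fitted effect from the response, and continue on the pared-down residuals. The factors are simulated and this is not the authors' R package.

set.seed(18)
n <- 200; p <- 50
X <- as.data.frame(lapply(1:p, function(j) factor(sample(0:2, n, replace = TRUE))))
names(X) <- paste0("f", 1:p)
y <- 2 * as.numeric(X[[3]]) - 1.5 * as.numeric(X[[17]]) + rnorm(n)   # two true factors

resid_y <- y; selected <- integer(0)
for (step in 1:3) {
  ss <- sapply(X, function(f) anova(lm(resid_y ~ f))[1, "Sum Sq"])   # single-term ANOVA sums of squares
  best <- which.max(ss)
  selected <- c(selected, best)
  resid_y <- resid(lm(resid_y ~ X[[best]]))                          # pare down the explained variation
}
selected   # indices of the factors picked up, in order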


EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e. the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1 / p2 and r- = (1 - p1) / (1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio r(x) = f1(x) / f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating r(x) and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.


WEDNESDAY 2ND DEC

9:00 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample for a "validation study" or a sample stratified on a health outcome for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.


9:50 - 10:30

WEDNESDAY 2ND DEC, Session 1 (Swifts): Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland; 2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al. (2002, Biometrical J. 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient, methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al. 1977, Ann. Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al. for their simulations.


CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute; 2Texas A and M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.


9:50 - 10:30

WEDNESDAY 2ND DEC, Session 1 (Boardroom): Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia
2CSIRO Mathematics, Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary tasks for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model reflecting the commonly used sampling process in marine surveys is proposed and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling is quite influential on presence/absence measures of species, which is no longer ignorable.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller-sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.


11:00 - 12:20

WEDNESDAY 2ND DEC, Session 2 (Swifts): Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
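For concreteness, here is a sketch of two of the estimators being compared: log-binomial regression and the modified (robust-variance) log-Poisson approach of Zou (2004), using the sandwich and lmtest packages in R. The trial data are simulated and the covariate names are invented.

library(sandwich)
library(lmtest)

set.seed(19)
n <- 400
trt <- rbinom(n, 1, 0.5); sev <- rnorm(n)
y <- rbinom(n, 1, pmin(0.9, exp(-1.6 + log(1.3) * trt + 0.2 * sev)))
d <- data.frame(y, trt, sev)

fit_lb <- glm(y ~ trt + sev, family = binomial(link = "log"), data = d,
              start = c(-1.5, 0, 0))                    # log-binomial; may fail to converge in practice
exp(coef(fit_lb))                                        # adjusted relative risks

fit_lp <- glm(y ~ trt + sev, family = poisson(link = "log"), data = d)
coeftest(fit_lp, vcov = vcovHC(fit_lp, type = "HC0"))    # robust SEs for the log-RR
exp(coef(fit_lp))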


MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park¹ and Yuan-chin Chang²
¹Chonnam National University
²Academia Sinica

E-mail: espark02@gmail.com

The idea of a response-adaptive design in a clinical trial is to allocate more subjects to the superior treatment during the trial without unduly diminishing its statistical significance and efficiency. In addition, innovations in genomic bio-medical research are making personalized medicine possible, which also makes adjustment for the covariates of the subjects who join a trial an important issue.

Adaptive design is a longstanding statistical method for situations where the design of a statistical model involves unknown parameters that must be estimated during the course of an experiment; thus the concept of sequential analysis is naturally involved. The large-sample properties of estimation under such a scheme have been studied and can be found in the literature, for example Zhang et al. (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires both the estimation and design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure -- the multiple-stage method, which requires the estimation and design to be updated only at each stage -- and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it retains some of the advantages of the fully sequential method while being more convenient in practical operation. Here we study the three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesized data is also presented.

Traditionally, a response-adaptive (RA) design is used under the assumption that there is no treatment-covariate interaction, that is, that the slopes for the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the RA design will make incorrect treatment allocations: it can be correct in one part of the population but completely wrong in the other. Thus, in this case, the CARA design should perform better than the RA design.


In this work we also compare sequential analysis in response-adaptive designs with and without covariate adjustment, and a numerical study of synthesized data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham¹
¹University of Otago, Christchurch

E-mail: patrick.graham@otago.ac.nz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple-category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al. (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes, and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30 day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.


LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu¹ and Thomas Suesse²
¹Victoria University
²Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail: iliu@msor.vuw.ac.nz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, in many surveys respondents may select more than one outcome category, so the observations can fall in more than one cell of the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.
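
For orientation, a small base-R sketch of the ordinary (single-response) local odds ratios for an I x J table; the multiple-response and Mantel-Haenszel estimators of the talk are not reproduced.

    local.or <- function(tab) {
      I <- nrow(tab); J <- ncol(tab)
      out <- matrix(NA, I - 1, J - 1)
      for (i in 1:(I - 1)) for (j in 1:(J - 1)) {
        # odds ratio for the 2 x 2 sub-table of adjacent rows and columns
        out[i, j] <- tab[i, j] * tab[i + 1, j + 1] /
                     (tab[i, j + 1] * tab[i + 1, j])
      }
      out
    }

    tab <- matrix(c(20, 10, 5,
                    15, 25, 10,
                     5, 15, 30), nrow = 3, byrow = TRUE)
    local.or(tab)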


11.00 - 12.20

WEDNESDAY 2ND DEC
Session 2 (Boardroom): Agriculture/Horticulture
Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn¹ and Geoff Asher¹
¹AgResearch, Invermay Agricultural Centre

E-mail: roger.littlejohn@agresearch.co.nz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono¹ and Peter Clarke¹
¹Dept of Agriculture and Food, Western Australia

E-mail: mdantuono@agric.wa.gov.au

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the 'seeming' lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS: ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer¹, Murray Judd², John Meekings³, Annette Richardson³ and Eric Walton⁴
¹Swinburne University of Technology
²Seeka Kiwifruit Industries
³The New Zealand Institute for Plant and Food Research Ltd
⁴University of Otago

E-mail: dmeyer@swin.edu.au

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding¹
¹Australian National University

E-mail: pauline.ding@anu.edu.au

There is a growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were Ground Cover Type (covered, uncovered), Distance from the log (0 cm, 10 cm, 20 cm, 40 cm, 80 cm) and Depth (1 cm, 5 cm). Two non-linear mixed models were used to study the different treatment effects.

13.30 WEDNESDAY 2ND DEC
Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment
Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B. Smith¹, Robin Thompson² and Brian R. Cullis¹
¹Wagga Wagga Agricultural Institute, Australia
²Rothamsted Research, Harpenden, UK

E-mail: alison.smith@industry.nsw.gov.au

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples. This precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.


14.10 - 15.10

WEDNESDAY 2ND DEC
Session 3 (Swifts): Design
Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams¹
¹Statistical Consulting Unit, ANU

E-mail: emlyn.williams@anu.edu.au

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang¹ and Kathy Ruggiero¹
¹University of Auckland

E-mail: kcha193@aucklanduni.ac.nz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for proteomic profiling (i.e. protein identification and quantification) for example, are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second-phase laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero¹ and Richard Jarrett²
¹School of Biological Sciences, The University of Auckland, New Zealand
²CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail: kruggiero@auckland.ac.nz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
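
A hedged sketch of the kind of limma workflow that could be used for such a two-colour design; the target file, image source format and reference label are placeholders rather than the authors' actual set-up.

    library(limma)

    targets <- readTargets("targets.txt")             # Cy3/Cy5 sample labels (assumed file)
    RG  <- read.maimages(targets, source = "genepix")
    MA  <- normalizeWithinArrays(RG, method = "loess")
    design <- modelMatrix(targets, ref = "control")   # assumed reference level
    fit <- lmFit(MA, design)
    fit <- eBayes(fit)
    topTable(fit, coef = 1, adjust.method = "BH")     # top differentially expressed genes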

14.10 - 15.10

WEDNESDAY 2ND DEC
Session 3 (Boardroom): Functional Analysis
Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe¹, Gillian Whalley¹, Rob Doughty¹ and Chris Triggs¹
¹The University of Auckland

E-mail: kpoppe@auckland.ac.nz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately to relaxation.

Three-dimensional ultrasound images of the heart allow volume in the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against the first and second derivatives evolves a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation, and so develop a new measure of global cardiac function.
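
A sketch, using the R package fda and toy numbers, of smoothing the volume measurements and taking derivatives; the projection and area calculations that define the new measure are not shown.

    library(fda)

    tt  <- seq(0, 1, length = 30)            # normalised time through one cycle
    vol <- 100 - 40 * sin(pi * tt)^2         # toy LV volume curve (ml)

    basis <- create.bspline.basis(rangeval = c(0, 1), nbasis = 15)
    fdpar <- fdPar(basis, Lfdobj = int2Lfd(2), lambda = 1e-6)  # roughness penalty
    fit   <- smooth.basis(tt, vol, fdpar)

    v0 <- eval.fd(tt, fit$fd)                # volume
    v1 <- eval.fd(tt, fit$fd, 1)             # first derivative, dV/dt
    v2 <- eval.fd(tt, fit$fd, 2)             # second derivative
    # (v0, v1, v2) traces the closed three-dimensional loop described above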


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford¹ and Glenn Stone¹
¹CSIRO

E-mail: david.clifford@csiro.au

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology divides the substance into, and quantifies the amount of, each compound that makes up the test substance. Typically, the first step in an analysis of data like this is the alignment of the data to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances -- e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al. (2009, Anal. Chem. 81(3), pp 1000-1007)
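
A simplified from-scratch R sketch of the idea of penalising non-diagonal warping steps; this is an illustration in the spirit of variable penalty DTW, not the authors' implementation (which is described in the reference above).

    dtw.penalised <- function(x, y, penalty = 0) {
      n <- length(x); m <- length(y)
      D <- matrix(Inf, n + 1, m + 1); D[1, 1] <- 0
      for (i in 1:n) for (j in 1:m) {
        d <- abs(x[i] - y[j])
        D[i + 1, j + 1] <- min(D[i, j] + d,                # diagonal step
                               D[i, j + 1] + d + penalty,  # non-diagonal, penalised
                               D[i + 1, j] + d + penalty)  # non-diagonal, penalised
      }
      D[n + 1, m + 1]                                      # total alignment cost
    }

    x <- sin(seq(0, 2 * pi, length = 50))
    y <- sin(seq(0, 2 * pi, length = 50) - 0.3)
    dtw.penalised(x, y, penalty = 0.1)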


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski¹, Qi Zhu¹ and Dirk Valkenborg²
¹I-BioStat, Hasselt University, Belgium
²Flemish Institute for Technological Research, Belgium

E-mail: tomasz.burzykowski@uhasselt.be

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, e.g., two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may get various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al. (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation of the model. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15.40 - 17.00

WEDNESDAY 2ND DEC
Session 4 (Swifts): Methods
Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke¹ and Peter Hall¹
¹University of Melbourne

E-mail: sjclarke@unimelb.edu.au

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians grows from a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although it is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances when this is not the case, which will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like FWER or FDR should enable us to improve the power of our procedures to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.
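
For context, a small R sketch of the standard Benjamini-Hochberg false discovery rate procedure applied to simulated p-values; ordinary BH assumes independence or positive dependence, which is precisely the assumption examined in this talk.

    set.seed(1)
    p <- c(runif(950), rbeta(50, 1, 50))   # mostly null p-values plus some signals
    p.bh <- p.adjust(p, method = "BH")
    sum(p.bh < 0.05)                       # number of discoveries at a nominal 5% FDR
    # p.adjust(p, method = "BY") is the more conservative variant,
    # valid under arbitrary dependence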


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer¹, Bo Cai² and Francois Perron³
¹University of Auckland, New Zealand
²University of South Carolina, USA
³University of Montreal, Canada

E-mail: meyer@stat.auckland.ac.nz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution, we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm for sampling from non-logconcave univariate densities. Using various examples, we demonstrate their properties and efficiencies, and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.
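
To fix ideas, a minimal random-walk Metropolis-Hastings sketch for a univariate target; it shows only the basic propose/accept step that the adaptive triangular and trapezoidal proposals refine, and is not the authors' algorithm.

    log.target <- function(x) dgamma(x, shape = 2, rate = 1, log = TRUE)

    mh <- function(n.iter, start = 1, sd = 0.8) {
      x <- numeric(n.iter); x[1] <- start
      for (t in 2:n.iter) {
        prop <- rnorm(1, x[t - 1], sd)                        # random-walk proposal
        log.alpha <- log.target(prop) - log.target(x[t - 1])  # log acceptance ratio
        x[t] <- if (log(runif(1)) < log.alpha) prop else x[t - 1]
      }
      x
    }

    draws <- mh(10000)
    mean(draws)   # should be near 2, the mean of the Gamma(2, 1) target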


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda¹
¹Victoria University of Wellington

E-mail: nsibanda@msor.vuw.ac.nz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) are used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.
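
A minimal sketch of the conjugate Dirichlet-multinomial update on which such estimators build, for uniquely classified (but sparse) counts; the non-unique classification and data-augmentation machinery of the talk is omitted, and the prior is an assumed Jeffreys-type choice.

    counts <- c(12, 3, 0, 1)         # sparse observed cell counts
    prior  <- rep(0.5, 4)            # Jeffreys-type Dirichlet prior

    post <- prior + counts           # posterior Dirichlet parameters
    post / sum(post)                 # posterior mean cell probabilities

    # posterior draws via independent gamma variables, normalised row-wise
    g <- matrix(rgamma(4 * 1000, shape = rep(post, each = 1000)), ncol = 4)
    draws <- g / rowSums(g)          # each row is one draw of the probability vector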

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs¹
¹University of Auckland

E-mail: jbri002@stat.auckland.ac.nz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatio-temporal model estimates with a general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


15.40 - 17.00

WEDNESDAY 2ND DEC
Session 4 (Boardroom): Mixtures & Classification
Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart¹
¹University of Sydney

E-mail: mstewart@usyd.edu.au

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA, 1996) and Ghosal and van der Vaart (Ann. Stat., 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee¹ and Yong Wang¹
¹The University of Auckland

E-mail: chee@stat.auckland.ac.nz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures for solving the problem. We show that the new approach performs better through simulation studies and some real-world biological data sets.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh¹, Nafees Anwar¹ and Selvanayagam Ganesalingam¹
¹Massey University

E-mail: sganesh@massey.ac.nz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function/rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally sized, or balanced, and the classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class or group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). The traditional classification techniques result in bad performance when they learn from imbalanced training sets. Thus, classification on imbalanced data has become an important research problem, with the main interest being on building models to correctly classify the minority class.

In this presentation, a brief overview of the approaches found in the literature is given, followed by details of some alternatives proposed. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples, and the findings are discussed.
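
A small R sketch of the first approach, random under-sampling of the majority class before fitting a classifier, using simulated data and ordinary logistic regression as the base learner.

    set.seed(42)
    n <- 2000
    dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
    dat$y <- rbinom(n, 1, plogis(-3 + dat$x1))    # y = 1 is the rare 'minority' class

    minority <- dat[dat$y == 1, ]
    majority <- dat[dat$y == 0, ]
    majority.down <- majority[sample(nrow(majority), nrow(minority)), ]
    balanced <- rbind(minority, majority.down)    # balanced training set

    fit <- glm(y ~ x1 + x2, family = binomial, data = balanced)
    summary(fit)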


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam¹, Siva Ganesh¹ and A. Nanthakumar¹
¹Massey University

E-mail: sganesh@massey.ac.nz

The estimation of error rates is of vital importance in classification problems, as it is used as a basis for choosing the best discriminant function, i.e. the one with the minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally, a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data, in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and the QDF are derived and computed for various covariance structures in a simulation exercise, and these serve as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. Also, this approximation provides a closed-form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions; linear discriminant function; quadratic discriminant function; Euclidean distance classifier; contaminated data
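
For illustration, an R sketch comparing the QDF (via MASS::qda) with simulated equal-mean, unequal-covariance data; the AEDC itself and the analytical error-rate approximations of the paper are not reproduced here.

    library(MASS)
    set.seed(1)

    n  <- 200
    g1 <- mvrnorm(n, mu = c(0, 0), Sigma = diag(2))        # population 1
    g2 <- mvrnorm(n, mu = c(0, 0), Sigma = 4 * diag(2))    # population 2, larger spread
    x   <- rbind(g1, g2)
    grp <- factor(rep(1:2, each = n))

    qfit <- qda(x, grp)
    mean(predict(qfit)$class != grp)    # apparent misclassification rate of the QDF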


THURSDAY 3RD DEC

9.00 Keynote Address (Swifts): Chris Triggs, University of Auckland
Chair: Ruth Butler

NUTRIGENOMICS - A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M. Triggs¹ and Lynnette R. Ferguson¹
¹The University of Auckland and Nutrigenomics New Zealand

E-mail: cmtriggs@auckland.ac.nz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory Bowel Diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high-throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9.50 - 10.30

THURSDAY 3RD DEC
Session 1 (Swifts): Genetics
Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao¹,², Emmanuelle Meugnier³ and Geoffrey McLachlan⁴
¹ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia
²Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia
³INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
⁴Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail: klecao@uq.edu.au

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade, ...). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap. 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu¹
¹Department of Statistics, University of Auckland

E-mail: jliu070@aucklanduni.ac.nz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift is used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation, and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne.

9.50 - 10.30

THURSDAY 3RD DEC
Session 1 (Boardroom): Ecology
Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman¹ and Renee Visser¹
¹Australian National University

E-mail: teresa.neeman@anu.edu.au

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS: A STATISTICAL ENCOUNTER

Dave Saville¹
¹Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail: savillestat@gmail.com

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting.


11.00 THURSDAY 3RD DEC
Invited Speaker (Swifts): Kaye Basford, University of Queensland
Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

V.N. Arief¹, I.H. Delacy¹,², J. Crossa³, P.M. Kroonenberg⁴, M.J. Dieters¹ and K.E. Basford¹,²
¹The University of Queensland, Australia
²Australian Centre for Plant Functional Genomics, Australia
³CIMMYT, Mexico
⁴Leiden University, The Netherlands

E-mail: kebasford@uq.edu.au

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al. 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control family-wise error rate and address non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analyzing the same genotypes for different TAM block and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


11.40 - 12.20

THURSDAY 3RD DEC
Session 2 (Swifts): Medical
Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang¹
¹Institute of Statistical Science, Academia Sinica

E-mail: ycchang@sinica.edu.tw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC, respectively). In some medical diagnostics it is necessary to confine the false positive rate within a specific range, which makes the pAUC a reasonable choice in such circumstances. Thus we emphasize the pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximizes the partial AUC is

l_p = (w_D S_D + w_D̄ S_D̄)^(-1) (m_D − m_D̄),

where m_D, S_D and m_D̄, S_D̄ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients w_D, w_D̄ ∈ R¹ depend on the given specificity and are also functions of l_p. Thus the solution for l_p requires an iterative procedure. We apply it to the data set of Liu et al. (2005, Stat in Med), and the numerical results show that our method outperforms that of Liu et al. (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al. (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximizes the pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large-sample properties of this method are derived. We then apply it to some real data sets, and the results are very promising, locating markers that are never found via AUC-based methods.
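
An R sketch of the Su and Liu (1993) style combination that maximises the full AUC under normality, evaluated by the empirical AUC on simulated data; the pAUC-constrained iteration and the LARS-like algorithm described above are not reproduced.

    library(MASS)
    set.seed(1)

    muD <- c(1, 0.5); muN <- c(0, 0)
    SD  <- matrix(c(1, 0.3, 0.3, 1), 2); SN <- diag(2)
    xD  <- mvrnorm(100, muD, SD)        # diseased-group markers
    xN  <- mvrnorm(100, muN, SN)        # non-diseased-group markers

    # AUC-maximising combination vector under the binormal model
    a  <- solve(cov(xD) + cov(xN), colMeans(xD) - colMeans(xN))
    sD <- drop(xD %*% a); sN <- drop(xN %*% a)

    mean(outer(sD, sN, ">"))            # empirical AUC of the combined score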


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser¹
¹Rhein Ahr Campus

E-mail: neuhaeuser@rheinahrcampus.de

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser, 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore, an asymmetric decision rule as proposed by Bauer & Köhne (1994) for adaptive designs is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus, the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1·p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994). For example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)


11.40 - 12.20

THURSDAY 3RD DEC
Session 2 (Boardroom): Genetics
Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang¹,², Colin Cavanagh²,³, Matthew Morell²,³ and Andrew George¹,²
¹CSIRO Mathematics, Informatics and Statistics
²CSIRO Food Futures National Research Flagship
³CSIRO Plant Industry

E-mail: Emma.Huang@csiro.au

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design representing phenotypic and genotypic diversity from across a population. The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, due to multiple founders and intermediate generations being unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds¹, Benoit Auvray¹, Peter Amer², Sheryl-Anne Newman¹ and Sheryl-Anne McEwan
¹AgResearch, Invermay Agricultural Centre, Mosgiel, New Zealand
²AbacusBio Limited, Dunedin, New Zealand

E-mail: ken.dodds@agresearch.co.nz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example this will almost be true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry, we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets and appropriate phenotypes for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler¹
¹Plant and Food Research

E-mail: Ruth.Butler@plantandfood.co.nz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p values are often interpreted in a classical analysis as giving 1 − p as the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews¹). In this poster, Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

¹ Matthews (2001, J. Stat. Plan. Inf. 94, 43-58)


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr¹, Tony Swain², Olena Kravchuk¹ and Geoffry Fordyce²
¹School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia
²Queensland Primary Industries and Fisheries, Qld, Australia

E-mail: okravchuk@uq.edu.au

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and Androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<5.7%) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms, and unequal variances for the repeated measures, with the heifer × day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al., Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The vaccine × weaner size × time interaction was only significant in 1992. The vaccine × time interaction was significant in 1990 and 1993. In 1991, no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccination, but the vaccine effect diminished as heifers aged. The nutrition × weaner size × time interaction was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study, with its complicated unbalanced repeated measures design.
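
A hedged R translation of the kind of model described (the original analysis used REML in GenStat); all variable names are placeholders, and the factorial fixed-effect structure is simplified.

    library(nlme)

    fit <- lme(log(P4) ~ vaccine * size * nutrition * time,
               random = ~ 1 | paddock/heifer,               # paddock and heifer random effects
               weights = varIdent(form = ~ 1 | time),       # separate residual variance per time
               data = heifers, na.action = na.omit)
    anova(fit)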


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly¹, David Logan¹ and Garry Hill¹
¹The New Zealand Institute for Plant and Food Research Limited

E-mail: patrick.connolly@plantandfood.co.nz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable: omitting a single data point can result in a substantially different picture being created.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the tree, which is then used to predict the other half of the data. By examining the predictive ability of the several thousands of trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
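
A hedged sketch of a BRT fit with the R package gbm; the response and predictor names are placeholders for the cicada counts and orchard variables.

    library(gbm)

    fit <- gbm(count ~ ., data = orchards, distribution = "poisson",
               n.trees = 3000, interaction.depth = 3, shrinkage = 0.01,
               bag.fraction = 0.5, train.fraction = 0.5)
    best <- gbm.perf(fit, method = "test")     # number of trees chosen by hold-out error
    summary(fit, n.trees = best)               # relative influence of each variable
    pred <- predict(fit, newdata = orchards, n.trees = best, type = "response")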


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard¹, Dongwen Luo¹ and Fred Potter¹
¹AgResearch Limited

E-mail: john.koolaard@agresearch.co.nz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep, and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations, and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk¹ and Peter Sopade²
¹School of Land, Crop and Food Sciences, University of Queensland
²Centre for Nutrition and Food Sciences, University of Queensland

E-mail: okravchuk@uq.edu.au

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling in an experiment investigating the digestibility of sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions changed in an obvious and complex way with changes in the milling energy. The average volumetric diameter alone was not an adequate summary of the distributions. It was thus necessary to construct a tailored algorithm for summarizing the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION

INDICES

David Lazaridis1 Jan Verbesselt2 and Andrew Robinson3

1Student The University of Melbourne2Remote sensing team CSIRO Sustainable Ecosystems

3Senior Lecturer The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. The techniques are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
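A minimal illustration of ridge and LASSO fitting with cross-validated choice of the penalty, using the R package glmnet and simulated stand-in data rather than the MODIS change metrics of the study:

library(glmnet)
set.seed(1)
X <- matrix(rnorm(100 * 20), 100, 20)        # stand-in for collinear vegetation indices
y <- X[, 1] - 0.5 * X[, 2] + rnorm(100)      # stand-in mortality severity

ridge <- cv.glmnet(X, y, alpha = 0)          # alpha = 0 gives the ridge penalty
lasso <- cv.glmnet(X, y, alpha = 1)          # alpha = 1 gives the LASSO penalty

coef(lasso, s = "lambda.min")                # coefficients at the CV-chosen penalty
pred <- predict(ridge, newx = X, s = "lambda.min")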


CAUTION COMPOSITIONS CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics Informatics and Statistics Canberra Australia2CSIRO Plant Industry Canberra Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens of thousands, if not hundreds of thousands, of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and explore the extent to which this might be a problem in applications. In particular we compare the analysis of log-transformed data to full compositional data analysis.
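One standard option from the compositional data literature is Aitchison's centred log-ratio (clr) transform; the sketch below applies it to a small hypothetical table of counts (the data and the 0.5 offset for zero counts are illustrative assumptions only):

clr <- function(x) {
  x <- x / sum(x)             # close the composition to sum to 1
  log(x) - mean(log(x))       # subtract the log geometric mean
}

counts <- matrix(c(120, 30, 850,
                   200, 45, 755), nrow = 2, byrow = TRUE)  # hypothetical counts per sample
t(apply(counts + 0.5, 1, clr))                             # clr-transformed samples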


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual validation tool, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative estimate we have employed a minimum-squares-type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data for all but one of the 83 species.
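A rough sketch of the two estimation approaches, applied to simulated weights rather than the NPF data; the least-squares criterion on the P-P plot below is a simplified rendering of the idea described above:

set.seed(1)
w <- rgamma(200, shape = 1.2, rate = 0.5)     # stand-in for the weights of one species

ml <- MASS::fitdistr(w, "gamma")              # maximum likelihood estimate

pp_ss <- function(logpar, x) {                # sum of squares on the P-P plot
  p_emp <- (rank(x) - 0.5) / length(x)
  p_fit <- pgamma(x, shape = exp(logpar[1]), rate = exp(logpar[2]))
  sum((p_fit - p_emp)^2)
}
fit_pp <- optim(log(ml$estimate), pp_ss, x = w)
shape_rate <- exp(fit_pp$par)                 # minimum-squares-type estimates

plot((rank(w) - 0.5) / length(w), pgamma(w, shape_rate[1], shape_rate[2]),
     xlab = "Empirical probability", ylab = "Fitted probability")
abline(0, 1)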


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND

METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007 A LONGITUDINAL STUDY

Sareh Rousta1 Sayed Mohsen Hosseini1 and Masoud Amini1

1Department of Biostatistics and Epidemiology Faculty of Health Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world, and its complications, such as kidney disease, cause patients considerable pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data on this trend in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that used the files of patients with type 2 diabetes who attended the Isfahan Endocrine and Metabolism Research Center from 1997 to 2007; this information was collected longitudinally. We used linear mixed-effects models to analyse the data. Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.

Conclusion: The information that this study provides can be used to identify groups at high risk of renal dysfunction.

Key Words: Longitudinal study, Mixed-effect models, Creatinine, Type 2 diabetes
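A sketch of the kind of model described, fitted with nlme to simulated longitudinal records with hypothetical variable names (not the IEMRC data):

library(nlme)
set.seed(1)
dat <- data.frame(patient = factor(rep(1:100, each = 5)),
                  time    = rep(0:4, 100),
                  age     = rep(rnorm(100, 55, 10), each = 5),
                  sex     = rep(rbinom(100, 1, 0.5), each = 5))
dat$creatinine <- 1 + 0.05 * dat$time + 0.005 * dat$age +
  rep(rnorm(100, 0, 0.2), each = 5) +               # random intercept per patient
  rep(rnorm(100, 0, 0.05), each = 5) * dat$time +   # random slope per patient
  rnorm(500, 0, 0.1)

fit <- lme(creatinine ~ time + age + sex, random = ~ time | patient, data = dat)
summary(fit)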


THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENT UNDERGONE ISOLATED

CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics Department of2Department of Epidemiology and Preventive Medicine Monash University

Victoria Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Most studies on postoperative complications for isolated CABG surgeries are based on one population and may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgeries for an Australian population, because there is no model developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation (60%) set and a model validation (40%) set. The data in the creation set were used to develop the model, and the validation set was then used to validate it. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC curve and the Hosmer-Lemeshow p-value, respectively. Results: Among the 14533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3) years. The rates of the two postoperative complications were 3.65% for new renal failure and 1.38% for stroke. The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L p < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L p < 0.001). Conclusion: We have identified risk factors for two major postoperative complications for CABG surgery.
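As a hedged illustration of the style of model building and checking described (simulated data, hypothetical variable names, and AIC-based backward elimination via step() in place of the p-value and bootstrap selection used in the study):

library(pROC)
set.seed(1)
n <- 2000
cabg <- data.frame(age = rnorm(n, 66, 10), female = rbinom(n, 1, 0.22),
                   cpb_time = rnorm(n, 90, 30), dialysis = rbinom(n, 1, 0.03))
cabg$renal_failure <- rbinom(n, 1, plogis(-6 + 0.04 * cabg$age + 0.8 * cabg$dialysis))

idx   <- sample(n, 0.6 * n)                  # 60% creation set, 40% validation set
train <- cabg[idx, ]; valid <- cabg[-idx, ]

sel <- step(glm(renal_failure ~ ., family = binomial, data = train),
            direction = "backward", trace = 0)
roc(valid$renal_failure, predict(sel, valid, type = "response"))  # discrimination (AUC)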


SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2

1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics Informatics and Statistics
2Marine and Coastal Environment Group Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis where physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data; instead they are usually point predictions from spatial models based on auxiliary data sources. It is not clear what effect the modelled covariates will have on the model, although simple approximations for simple models do give indications. We have performed some simulation studies to investigate the nature of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.


COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence Brno2UCBI Masaryk University Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this work is to identify the dependency structure of the gene variants which influence septic states in paediatric patients.

The data set contains data on 580 paediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1, 2, 3]. To identify the role of different combinations of gene variants, and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. In this way it was possible to create a 5-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patient group. The typical combinations of gene variants for the healthy group and for the septic patient group were then found. The results correspond nicely to the results published in [1, 2, 3] for individual genes, and make it possible to recognise the typical combination of variants of the six genes on which attention should be concentrated.

References: [1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol. 33, pp. 2158-2164. [2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. Cesko-Slovenska Pediatrie, 59, pp. 632-636, 2004. [3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukine-6 gene variants and the risk of sepsis development in children. Human Immunology, ISSN 0198-8859, 2007, vol. 68, pp. 756-760.
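A much smaller hedged illustration of hierarchical log-linear modelling, using a hypothetical 2 x 2 x 2 table of two gene variants by patient group rather than the 5-dimensional table analysed in the study:

library(MASS)
tab <- array(c(40, 25, 30, 55,     # hypothetical counts: septic group
               40, 60, 35, 70),    # hypothetical counts: healthy group
             dim = c(2, 2, 2),
             dimnames = list(TLR299  = c("wild", "variant"),
                             BPI_Taq = c("wild", "variant"),
                             group   = c("septic", "healthy")))
m1 <- loglm(~ TLR299 + BPI_Taq + group, data = tab)        # mutual independence
m2 <- loglm(~ (TLR299 + BPI_Taq + group)^2, data = tab)    # all two-way associations
anova(m1, m2)                                              # compare the hierarchical models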


IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD SCALE MARK-CAPTURE

TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd220 Westminster Rd Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application, with a tractor-mounted boom sprayer, of Bacillus thuringiensis Berliner (Bt) was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion of the moths remained was also estimated, viz for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
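Given an estimate of b from the dispersal curve (the value below is hypothetical, not the study's estimate), the distance c containing a proportion p of the moths can be found numerically, for example:

# The dispersal curve itself could be fitted with, e.g., glm(catch ~ distance, family = poisson)
dispersal_distance <- function(b, p) {
  f <- function(c) exp(-b * c) * (1 + b * c) - (1 - p)
  uniroot(f, c(0, 1e5))$root
}
dispersal_distance(b = 0.016, p = 0.90)   # distance within which 90% of moths remain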


IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT

BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study all oxygen saturation measurements across different behavioural states were combined together. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that the clustering based on active sleep segments performs better than the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.

Zahari M Lee DS Russell G et al (2008 Australian Statistical Conference 2008)
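A minimal sketch of the clustering step, applied to simulated oxygen-saturation segments rather than the study data:

set.seed(1)
segs <- lapply(1:17, function(i) {                   # one matrix of segments per infant
  sdev <- sample(c(1, 3), 1)                         # stable versus unstable variability
  matrix(rnorm(20 * 60, mean = 95, sd = sdev), nrow = 20)
})
cv <- lapply(segs, function(m) apply(m, 1, function(x) sd(x) / mean(x)))

grid <- seq(0, 0.1, length = 200)                    # common grid for the EDFs
edfs <- sapply(cv, function(v) ecdf(v)(grid))        # one column of EDF values per infant
grp  <- cutree(hclust(dist(t(edfs)), method = "average"), k = 2)
grp                                                  # two groups: stable versus unstable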


Index of Presenting Authors

Arnold R 33Asghari M 26Baird D 30Balinas V 46Barnes M 27Basford KE 84Beath K 51Bland JM 49Briggs J 76Burridge C 28Burzykowski T 73Butler R (poster) 89Campain A 56Chang K 70Chang Y 85Chee C 77Clark ASS 36Clarke S 74Clifford D 72Connolly P (poster) 91Cui J 55DrsquoAntuono M 67Darnell R (1) 35Darnell R (2) 47Davy M 40Day S 43Ding P 69Dobbie M 48Dodds K 88Fewster R 37Forrester R 34Ganesalingam S 79Ganesh S 78Gatpatan JMC 48Graham M 33Graham P 65Huang E 87Hwang J 57Ihaka R 36Jones G 45Kifley A 53Kipnis V 61Koolaard J (poster) 92Kravchuk O (poster 1) 90Kravchuk O (poster 2) 92Lazaridis D 93

Le Cao K 81Littlejohn R 67Liu I 66Liu J 82Lumley T 59Marschner I 52Matthews D 58McLachlan A 44Meyer D 68Meyer R 75Mohebbi M 38Mueller S 47Muller W (poster) 94Naka M (poster) 95Neeman T 82Neuhaumluser M 86Orellana L 54Park E 64Park Z 42Pirie M 32Poppe K 71Rousta S (poster) 96Ruggiero K 71Ryan L 25Sanagou M (poster) 97Saville D 83Scott A 60Shimadzu H 62Shimadzu H (poster) 98Sibanda N 76Smerek M (poster) 99Smith AB 69Stewart M 77Stojanovski E 31Taylor J 41Thijs H 50Triggs CM 80Wallace AR (poster) 100Wang Y 29Welsh A 51Williams E 70Yee T 62Yelland L 63Yoon H 39Zahari M (poster) 101


DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau


Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde


Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz


GENERAL INFORMATION

Name Tags
Please wear your name badge at all times during the conference and at social events

Mobile Phones
As a courtesy to presenters and colleagues, please ensure that your mobile phone is switched off during the conference sessions

Conference Catering
Lunches, Morning and Afternoon Teas will be served at the 'Chill on Northcroft' Restaurant (see venue floor plan on page 16)

Conference Dinner
Tickets are required for the Conference Dinner. If you have misplaced or did not receive tickets at registration, or wish to purchase additional tickets, please see one of the conference organisers at the registration desk

Transport has been arranged in coaches to transfer delegates to dinner from the Suncourt Hotel & Conference Centre, leaving at 6 pm, with return trips at the conclusion of the event

Welcome Reception (Sunday 29 November)
A welcome reception will be held from 6 to 8 pm on Sunday 29 November on the Chill restaurant decks overlooking the lake and mountains at the Suncourt Hotel and Conference Centre. Drinks (beer, house wine, soft drinks and orange juice) and a selection of hot and cold hors d'oeuvres will be on offer


VENUE INFORMATION & MAP

Venue
The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe

Suncourt Hotel
Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away

Driving directions to Huka Prawn Farm
Head west on Northcroft Street toward Titiraupenga Street (0.2 km)
Turn left at Titiraupenga Street (31 m)
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street then onto Tongariro Street)
Continue onto Tongariro Street (1.1 km - go through one roundabout)
Continue onto SH 1 / SH 5 (1.0 km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (Note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past Honey Hive to end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo, (07) 378 8265, wwwsuncourtconz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo


ORGANISED SOCIAL ACTIVITIES

Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind, the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member non-member or student) attending the whole week Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim then meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo, Tel (07) 378-7080)

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities- Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors smell the coffee brewing as you board the Waikare II take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park The sights are amazing all year round Afternoon tea is included as part of your charter and tea or coffee are complimentary throughout the cruise There are also full bar facilities

Fishing for and hopefully eating rainbow or brown trout is included in the charter although to meet licence requirements only four clients can be nominated to actually land the catch Only 4 lines can be put out at a time on downriggers If successful any catch can be barbequed or sashimied and served and shared onboard - there


is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed; there are extra charges for this. The trout could also be taken back to your accommodation, where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting; the cost is $1.80 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth
Where: Boat harbour/marina at mouth of the Waikato River at north end of lake front
Take: Swimwear including towel if you want an invigorating deep water swim off the launch. Don't forget to take your camera as some of the scenery can only be seen from on the water
Cost: $70 per person based on a three hour scenic charter including fishing, with clay bird shooting extra at $1.80 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park


Leaving the gushing sounds of the mesmerizing Falls, you cut through a leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick up at a pre-arranged time to return to your residence

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoors shoes, sunscreen, hat and camera (waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating geothermal and nature - Orakei Korako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Korako, possibly the best thermal area in New Zealand


In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's cave and some of the largest silica terraces in the world

While the park visitors are on land option 2 ($140) whisks swimmers away to the Squeeze You will disembark the boat in knee deep warm water After manoeuvring your way through narrow crevasses climbing boulders and wading through waist-deep warm water you emerge in stunning native New Zealand bush Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool

Then the groups rejoin for the thrilling return trip giving a total trip time of about three hours This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience

Time: Transport departs Suncourt Hotel at 1.30 pm, returns at approximately 5.30 pm
Take: Appropriate footwear for option 1, swimwear and towel for option 2. Don't forget to take your camera as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1 including park admission, $140 pp for option 2, both options including transport
Notes: For this activity to proceed we require a minimum of only 4 people in total as the same boat is used


4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly, terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: Cafe snack is not included but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22


SPONSORS

The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland


AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax


VENUE FLOOR PLAN

1 Boardroom: For all Board session presentations
2 Swifts: For keynote addresses, invited speaker talks and all Swifts sessions
3 Bathrooms/Toilets
4 'Chill on Northcroft' Restaurant: All morning/afternoon teas and lunches will be provided here
5 Gullivers: Computer room with two internet access desktops
6 Lems: Registration desk location and further desk space and power points for wireless internet access



CONFERENCE TIMETABLE

SUNDAY 29TH NOV1600 Conference Registration opens1800 Welcome Reception

Dinner (own arrangement)

MONDAY 30TH NOV850 Presidential Opening (Swifts)

Graham Hepworth University of Melbourne900 Keynote Address (Swifts)

Louise Ryan CSIRO Mathematics Informatics and StatisticsQuantifying uncertainty in risk assessmentChair Graham Hepworth

950

-1030

Session 1 Swifts Medical

Chair John Field

Session 1 Boardroom Ecological ModellingChair Teresa Neeman

950 Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approachMohamad Asghari Tarbiat Modares University

Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splinesCharis Burridge CSIRO Mathematics Informatics and Statistics

1010 Personalised medicine endovascular aneurysm repair risk assessment model using preoperative variablesMary Barnes CSIRO Mathematics Informatics and Statistics

Rank regression for analyzing environmental dataYou-Gan Wang CSIRO Mathematics Informatics and Statistics

1030 Morning Tea (30 minutes)1100

-1220

Session 2 Swifts Modelling

Chair Andrew McLachlan

Session 2 Boardroom Environmental amp Methods

Chair Zaneta Park1100 Introduction to Quantile

regressionDavid Baird VSN NZ Ltd

Capture recapture estimation using finite mixtures of arbitrary dimension Richard Arnold Victoria University


MONDAY 30TH NOV1120 Incorporating study

characteristics in the modelling of associations across studiesElizabeth Stojanovski University of Newcastle

The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabiesRobert Forrester ANU

1140 A comparison of matrices of time series with application in dendroclimatologyMaryanne Pirie University of Auckland

Model based grouping of species across environmental gradientsRoss Darnell CSIRO Mathematics Informatics and Statistics

1200 How SAS and R integrateMichael Graham SAS Auckland

The use of the chi-square test when observations are dependentAustina Clark University of Otago

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts) Ross Ihaka University of AucklandWriting Efficient Programs in R and BeyondChair Renate Meyer

1410

-1510

Session 3 Swifts Variance

Chair Geoff Jones

Session 3 Boardroom Genetics

Chair John Koolaard

1410 Variance estimation for systematic designs in spatial surveysRachel Fewster University of Auckland

Developing modules in genepattern for gene expression analysisMarcus Davy Plant and Food Research

1430 Variance components analysis for balanced and unbalanced data in reliability of gait measurementMohammadreza Mohebbi Monash University

High dimensional QTL analysis within complex linear mixed modelsJulian Taylor CSIRO Mathematics Informatics and Statistics

1450 Modernizing AMOVA using ANOVAHwan-Jin Yoon ANU

Correlation of transcriptomic and phenotypic data in dairy cowsZaneta Park AgResearch

1510 Afternoon Tea (30 minutes)


MONDAY 30TH NOV1540

-1700

Session 4 Swifts Modelling

Chair Mario D'Antuono

Session 4 Boardroom Ecology

Chair Rachel Fewster1540 Non-inferiority margins in

clinical trialsSimon Day Roche Products Ltd

Visualising model selection criteria for presence and absence data in ecology Samuel Mueller University of Sydney

1600 Data processing using Excel with RAndrew McLachlan Plant and Food Research Lincoln

Estimating weights for constructing composite environmental indicesRoss Darnell CSIRO Mathematics Informatics and Statistics

1620 Investigating covariate effects on BDD infection with longitudinal data Geoffrey Jones Massey University

A spatial design for monitoring the health of a large-scale freshwater river systemMelissa Dobbie CSIRO Mathematics Informatics and Statistics

1640 Statistical modelling of intrauterine growth for FilipinosVincente Balinas University of the Philippines Visayas

Backfitting estimation of a response surface modelJhoanne Marsh C Gatpatan University of the Philippines Visayas

1700 Poster SessionChair Melissa Dobbie

1800 Dinner (own arrangement)


TUESDAY 1ST DEC900 Keynote Address (Swifts)

Martin Bland University of YorkClustering by treatment provider in randomised trialsChair Simon Day

950

-1030

Session 1 Swifts Missing Data

Chair Vanessa Cave

Session 1 Boardroom Count Data

Chair Hwan-Jin Yoon950 The future of missing data

Herbert Thijs Hasselt University

A strategy for modelling count data which may have extra zerosAlan Welsh ANU

1010 Application of latent class with random effects models to longitudinal dataKen Beath Macquarie University

A reliable constrained method for identity link Poisson regressionIan Marschner Macquarie University

1030 Morning TeaIBS Biennial General Meeting (60 minutes)

1130

-1230

Session 2 Swifts Medical

Chair Hans Hockey

Session 2 Boardroom Modelling

Chair Olena Kravchuk1130 Multivariate response

models for global health-related quality of lifeAnnette Kifley Macquarie University

Building a more stable predictive logistic regression modelAnna Campain University of Sydney

1150 Estimation of optimal dynamic treatment regimes from longitudinal observational dataLiliana Orellana Universidad de Buenos Aires

Stepwise paring down variation for identifying influential multifactor interactionsJing-Shiang Hwang Academia Sinica

1210 Parametric conditional frailty models for recurrent cardiovascular events in the lipid studyJisheng Cui Deakin University

Empirical likelihood estimation of a diagnostic test likelihood ratioDavid Matthews University of Waterloo

1230 Lunch (1 hour)1330 Organised Social Activities

1800 Dinner (own arrangement)


WEDNESDAY 2ND DEC900 Keynote Address (Swifts)

Thomas Lumley University of WashingtonUsing the whole cohort in analysis of subsampled data Chair Alan Welsh

950

-1030

Session 1 Swifts Clinical Trials

Chair Ian Marschner

Session 1 Boardroom Fisheries

Chair Charis Burridge950 Adjusting for nonresponse in

case-control studiesAlastair Scott University of Auckland

An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimationHideyasu Shimadzu GeoScience Australia

1010 Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associationsVictor Kipnis USA National Cancer Institute

On the 2008 World Fly Fishing ChampionshipsThomas Yee University of Auckland

1030 Morning Tea (30 minutes)

1100

-1220

Session 2 Swifts Medical Models

Chair Katrina Poppe

Session 2 Boardroom AgricultureHorticulture

Chair Emlyn Williams

1100 Relative risk estimation in randomised controlled trials a comparison of methods for independent observationsLisa Yelland University of Adelaide

Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactationRoger Littlejohn AgResearch

1120 Multiple stage procedures in covariate-adjusted response-adaptive designsEunsik Park Chonnam National University

Some statistical approaches in estimating lambing rates
Mario D'Antuono Dept of Agriculture WA


WEDNESDAY 2ND DEC1140 Potential outcomes and

propensity score methods for hospital performance comparisonsPatrick Graham University of Otago

FTIR analysis associations with induction and release of kiwifruit buds from dormancyDenny Meyer Swinburne University of Technology

1200 Local odds ratio estimation for multiple response contingency tablesIvy Liu Victoria University

Non-linear mixed-effects modelling for a soil temperature studyPauline Ding ANU

1220 Lunch (1 hour 10 minutes)1330 Invited Speaker (Swifts)

Alison Smith NSW Department of Industry and InvestmentEmbedded partially replicated designs for grain quality testingChair David Baird

1410

-1510

Session 3 Swifts Design

Chair Ross Darnell

Session 3 Boardroom Functional AnalysisChair Marcus Davy

1410 Spatial models for plant breeding trialsEmlyn Williams ANU

Can functional data analysis be used to develop a new measure of global cardiac functionKatrina Poppe University of Auckland

1430 A two-phase design for a high-throughput proteomics experimentKevin Chang University of Auckland

Variable penalty dynamic warping for aligning GC-MS dataDavid Clifford CSIRO

1450 Shrinking sea-urchins in a high CO2 world a two-phase experimental designKathy Ruggiero University of Auckland

A model for the enzymatically 18O-labeled MALDI-TOF mass spectra
Tomasz Burzykowski Hasselt University

1510 Afternoon Tea (30 minutes)


WEDNESDAY 2ND DEC1540

-1700

Session 4 Swifts Methods

Chair David Clifford

Session 4 Boardroom Mixtures amp Classification

Chair Thomas Yee1540 High-dimensional multiple

hypothesis testing with dependenceSandy Clarke University of Melbourne

On estimation of nonsingular normal mixture densitiesMichael Stewart University of Sydney

1600 Metropolis-Hastings algorithms with adaptive proposalsRenate Meyer University of Auckland

Estimation of finite mixtures with nonparametric componentsChew-Seng Chee University of Auckland

1620 Bayesian inference for multinomial probabilities with non-unique cell classification and sparse dataNokuthaba Sibanda Victoria University

Classification techniques for class imbalance dataSiva Ganesh Massey University

1640 Filtering in high dimension dynamic systems using copulasJonathon Briggs University of Auckland

Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the meanSelvanayagam Ganesalingam Massey University

1800 Conference Dinner


THURSDAY 3RD DEC900 Keynote Address (Swifts)

Chris Triggs University of AucklandNutrigenomics - a source of new statistical challengesChair Ruth Butler

950

-1030

Session 1 Swifts Genetics

Chair Ken Dodds

Session 1 Boardroom Ecology

Chair Duncan Hedderley950 Combination of clinical and

genetic markers to improve cancer prognosisKim-Anh Le Cao University of Queensland

A multivariate feast among bandicoots at Heirisson ProngTeresa Neeman ANU

1010 Effective population size estimation using linkage disequilibrium and diffusion approximationJing Liu University of Auckland

Environmental impact assessments a statistical encounterDave Saville Saville Statistical Consulting Ltd

1030 Morning Tea (30 minutes)1100 Invited Speaker (Swifts)

Kaye Basford University of QueenslandOrdination of marker-trait association profiles from long-term international wheat trialsChair Lyn Hunt

1140

-1220

Session 2 Swifts Medical

Chair Ken Beath

Session 2 Boardroom Genetics

Chair Julian Taylor1140 Finding best linear

combination of markers for a medical diagnostic with restricted false positive rateYuan-chin Chang Academia Sinica

Believing in magic validation of a novel experimental breeding designEmma Huang CSIRO Mathematics Informatics and Statistics

A modified combination test for the analysis of clinical trials
Markus Neuhäuser Rhein Ahr Campus

Phenotypes for training and validation of whole genome selection methodsKen Dodds AgResearch

1220 Closing Remarks1230 Lunch1300 Conference Concludes


ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts)
Louise Ryan CSIRO Mathematics Informatics and Statistics
Chair: Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise Ryan
CSIRO Mathematics Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach appropriate monetary value (e.g. insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability, or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and illustrate them with several examples from the environmental arena.


950 - 1030

MONDAY 30TH NOV
Session 1 Swifts Medical
Chair: John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING

RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal cancer (CRC) is one of the most malignant cancers worldwide, and its incidence varies because risk factors have different effects in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluating the risk factors of the cancer as a whole would not provide a thorough understanding of the disease. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis, according to the pathology reports of the RCGLD cancer registry from 1 January 2002 to 1 October 2007, were entered into the study. Data were analyzed using univariate and multivariate competing risks survival analysis in the Stata statistical software. The results confirm gender, alcohol history, IBD and tumor grade as specific risk factors of colon cancer, and hypertension, opium use and personal history as specific risk factors of rectum cancer. BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity; colon and rectum cancers should be evaluated separately to reveal hidden associations which may not be revealed under general modeling. These findings could provide more information for prognosis and treatment, and for the possible application of screening programs specifically for colon and rectum carcinomas.


PERSONALISED MEDICINE ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING

PREOPERATIVE VARIABLES

Mary Barnes1 Robert Fitridge2 and Maggi Boult2

1CSIRO Australia Mathematics Informatics and Statistics Glen Osmond

South Australia2Department of Surgery University of Adelaide the Queen Elizabeth Hospital

Adelaide South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18 month period between 1999 and 2001, and whose outcomes were followed for more than five years.

The ERA Model is available at the following website (wwwhealthadelaideeduausurgeryevar). The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA Model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using specialist UK vascular institute data. Despite UK patients being sicker (p < 0.001), having larger aneurysms (p < 0.001) and being more likely to die (p < 0.05) than the Australian patients, the ERA model fitted UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher area under ROC curves and/or higher R2.

The ERA Model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008) Eur J Vasc Endovasc Surg 35:571-579


950 - 1030

MONDAY 30TH NOV
Session 1 Boardroom Ecological Modelling
Chair: Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED

REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics Informatics and Statistics2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that an annual multi-species fishery-independent survey be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (httpwwwstatuni-muenchende~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with credible intervals for each region as well as the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
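Not the authors' BayesX implementation, but a rough non-Bayesian analogue of a penalised regression spline surface can be sketched with the R package mgcv on simulated trawl data:

library(mgcv)
set.seed(1)
trawl <- data.frame(lon = runif(300, 136, 141), lat = runif(300, -17, -12))
trawl$catch <- rpois(300, exp(2 + sin(trawl$lon) + cos(trawl$lat)))  # stand-in prawn counts

fit <- gam(catch ~ s(lon, lat, k = 30), family = quasipoisson,
           data = trawl, method = "REML")      # penalised thin-plate spline surface
summary(fit)
vis.gam(fit, plot.type = "contour")            # fitted spatial surface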


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics Informatics and Statistics Australia2School of Mathematics and Statistics Northeast Normal University China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust, has been found to be more natural when substantial proportions of the observations are below detection limits (censored), and is more efficient when errors have heavy-tailed distributions. To alleviate the computational burden we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.


1100 -1220

MONDAY 30TH NOV
Session 2 Swifts Modelling
Chair: Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird1
1VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize Σ e(Q - I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming. Inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
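A minimal sketch of fitting several quantile curves in R with the quantreg package (one of the implementations mentioned above); the simulated data and variable names are illustrative assumptions.

```r
# Illustrative sketch (not from the talk): 10th, 50th and 90th percentile fits.
library(quantreg)

set.seed(1)
x <- runif(200, 0, 10)
y <- 3 + 0.8 * x + rnorm(200, sd = 0.5 + 0.3 * x)   # heteroscedastic data

# One rq() call per quantile of interest
fits <- lapply(c(0.1, 0.5, 0.9), function(tau) rq(y ~ x, tau = tau))

plot(x, y, pch = 16, col = "grey")
for (f in fits) abline(f)
```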


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS, University of Newcastle
2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewing's sarcomas, using 2-year survival as the outcome, will be investigated. Information regarding these associations is, however, limited and often conflicting. This may be partly attributed to differences between studies, which can be considered sources of statistical heterogeneity.

The purpose of a recent meta-analysis conducted by Honoki et al. [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study, a random-effects Bayesian meta-analysis model is used to combine the reported estimates of the selected studies, by allowing major sources of variation to be taken into account: study-level characteristics, and between- and within-study variance. Initially, the observed risk ratios are assumed to be random samples from study-specific true ratios, which are themselves assumed to be distributed around an overall ratio. In the second model, there are hierarchical levels between the study-specific parameters and the overall distribution. The latter model can thus accommodate partial exchangeability between studies, acknowledging that some studies are more similar due to common designs, locations and so on. Analysis was undertaken in WinBUGS.

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this analysis strengthens the findings of the earlier study.


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie1
1Department of Statistics and School of Geography, Geology and Environment,

University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings. The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not. Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand, resulting in an abundance of material spanning thousands of years. Therefore kauri has strong potential as a source for inferring past climates.

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years. However, there are concerns that the observed patterns are the result of a possible failure of the uniformitarianism principle. This is because the response of kauri to the common climate signal in a particular year may be influenced by the size of the tree, and hence this change in response could affect the observed trends in reconstructed ENSO activity. Therefore the dataset, containing time series of ring width indices for each core, was divided into two subsets:

1. The portion of the series produced when the trees were small, and

2. The portion of the series produced when the trees were large.

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different, and to allow specific time periods of difference/similarity to be identified.


HOW SAS AND R INTEGRATE

Michael Graham1
1Analytics - SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers a way to experiment with new, cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together, and plans for future integration.

1100 - 1220

MONDAY 30TH NOV
Session 2, Boardroom: Environmental & Methods
Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ
2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments, samples of individuals from a finite population are observed on a number of separate sampling occasions, with the goal of estimating the size of the population. Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity, both among individuals and between sampling occasions.

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures. RJMCMC automates the selection of models with differing numbers of components, and with or without an interaction term. To demonstrate the method we analyse a reliability testing data set.


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit, ANU
2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction, including sexual development. GonaCon™ is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals, but as yet has not been tested in marsupials. Thirty five juvenile tammar wallabies received one of three treatments: sham vaccination (Control), a single vaccination of GonaCon™ (Vac1), or a single vaccination of GonaCon™ followed by a booster (Vac2). Growth measurements on the animals were taken on 18 occasions, at irregular intervals, over the next 115 weeks. Of particular interest was whether there is any difference between the animals that received the single or boosted vaccination.

The data are analysed using repeated measures methods to assess the long term effects of the vaccination. Since the data are unequally spaced in time, this restricts the number of possible options available. Some approaches are explored and the differences between the results examined.
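One possible approach for unequally spaced repeated measures, shown only as a hedged sketch and not as the authors' analysis, is a linear mixed model with a continuous-time AR(1) correlation structure; the simulated data and all variable names are assumptions.

```r
# Illustrative sketch: irregularly spaced growth measurements analysed with
# nlme, using corCAR1 so the correlation decays with the actual time gap.
library(nlme)

set.seed(1)
weeks <- sort(sample(1:115, 18))                        # 18 irregular occasions
d <- expand.grid(animal = factor(1:35), week = weeks)
d$treat  <- factor(rep(c("Control", "Vac1", "Vac2"), length.out = 35))[d$animal]
d$weight <- 1 + 0.05 * d$week + 0.2 * rnorm(35)[d$animal] +
            rnorm(nrow(d), sd = 0.1)

fit <- lme(weight ~ treat * week, random = ~ 1 | animal,
           correlation = corCAR1(form = ~ week | animal), data = d)
anova(fit)
```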


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Wealth from Oceans Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data. The approach allows the simultaneous grouping and quantification of multiple species' responses to environmental gradients. The underlying statistical model is a finite mixture model, where mixing is performed over the individual species' modelled responses. Species with similar responses to the environment are grouped with minimal information loss. We term these groups species-archetypes. Each species-archetype has an associated GLM that allows prediction with appropriate measures of uncertainty. We illustrate the concept and method using artificial data. We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients, from 12 to 24 degrees S. The 200 species appear to be well represented by 15 species-archetypes. The model is interpreted through maps of the probability of presence for a fine-scale set of locations throughout the study area. Maps of uncertainty are also produced to provide statistical context. The probability of presence of each species-archetype was strongly influenced by oceanographic gradients, principally temperature, oxygen and salinity. The number of species in each cluster ranged from 4 to 34. The method has potential application to the analysis of multispecies distribution patterns and for multispecies management.


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark1
1University of Otago

E-mail aclarkmathsotagoacnz

When the Chi-square test is applied to test the association between two multinomial distributions, each with a given number of cells, we usually assume that cell observations are independent. If some of the cells are dependent, we would like to investigate (1) how to implement the Chi-square test and (2) how to find the test statistics and the associated degrees of freedom. The test statistics and degrees of freedom are developed from results by Geisser, S. & Greenhouse, S. W. (1958, JEBS, 69-82) and Huynh, H. & Feldt, L. S. (1976, AMS, 885-891). We will use an example of influenza symptoms of two groups of patients to illustrate this method. One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza. There were twelve symptoms collected for each patient, and these symptoms were not totally independent.

1330 MONDAY 30TH NOV
Invited Speaker (Swifts): Ross Ihaka, University of Auckland
Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ
2University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R, and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.
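As a generic illustration of the kind of efficiency problem referred to (not an example taken from the talk), growing an object inside a loop is one common R pitfall:

```r
# Illustrative only: repeated reallocation versus pre-allocation/vectorisation.
n <- 1e5

# Slow: the vector is copied and extended on every iteration
slow <- c()
for (i in 1:n) slow <- c(slow, i^2)

# Faster: pre-allocate, or better still vectorise
fast <- numeric(n)
for (i in 1:n) fast[i] <- i^2
vectorised <- (1:n)^2

identical(fast, vectorised)   # TRUE
```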


1410 - 1510

MONDAY 30TH NOV
Session 3, Swifts: Variance
Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1
1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator. However, estimating the reduced variance is well known to be a difficult problem. If variance is estimated without taking account of the systematic design, the gain in reducing the variance can be lost by overestimating it. The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design, to approximate it by a stratified design, or to model the correlation between population units. Approximation by a random design can perform very poorly, while approximation by a stratified design is an improvement but can still be severely biased in some situations. I will describe a new estimator based on modelling the correlation in encounters between samplers that are close in space. The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision. The estimator is applied to surveys of spotted hyena in the Serengeti National Park, Tanzania, for which it produces a dramatic change in the reported coefficient of variation.


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT MEASUREMENT

Mohammadreza Mohebbi1,2, Rory Wolfe1,2, Jennifer McGinley2, Pamela Simpson1,2, Pamela Murphy1,2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session error components can be estimated at each point of the gait cycle, or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods was illustrated in examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.
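A minimal sketch, assuming hypothetical data and column names, of how such variance components might be estimated in R with lme4; this is not the authors' code.

```r
# Illustrative sketch: within-subject, within-assessor and within-session
# variance components for a single gait summary measure, by REML.
library(lme4)

set.seed(1)
gait <- expand.grid(subject = factor(1:12), assessor = factor(1:3),
                    session = factor(1:2), rep = 1:2)
gait$score <- 50 + rnorm(12, sd = 3)[gait$subject] +
              rnorm(36, sd = 1)[interaction(gait$subject, gait$assessor)] +
              rnorm(nrow(gait), sd = 2)

fit <- lmer(score ~ 1 + (1 | subject) + (1 | subject:assessor) +
              (1 | subject:session), data = gait)
print(VarCorr(fit), comp = "Variance")
```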


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region, using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages are required, such as Arlequin and GenAlEx.

Using fungus microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages, in which ANOVA and REML are standard methods, may be preferred to the special genetics packages for AMOVA, as they offer the advantages of REML and allow greater flexibility in model specification.
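A minimal sketch of fitting an AMOVA-like hierarchical variance component model by REML in a general statistical package (here R's lme4); the data and names are simulated assumptions, not the fungus microsatellite data.

```r
# Illustrative sketch: individuals within populations within regions.
library(lme4)

set.seed(1)
d <- expand.grid(region = factor(1:4), population = factor(1:5), indiv = 1:10)
d$y <- rnorm(nrow(d)) + rnorm(4)[d$region] +
       0.5 * rnorm(20)[interaction(d$region, d$population)]

fit <- lmer(y ~ 1 + (1 | region / population), data = d)
print(VarCorr(fit), comp = "Variance")   # among-region, among-population, residual
```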


1410 - 1510

MONDAY 30TH NOV
Session 3, Boardroom: Genetics
Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules, using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular, we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla1,2

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help understand the influence of underlying genetics on traits of interest. In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments. Due to the nature of these experiments, extra components of variation, such as spatial trends and extraneous environmental variation, need to be accommodated, and this can be achieved using linear mixed models. However, with these models the inclusion of an additional high dimensional genetic component becomes problematic. This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework. The methodology shows that this incorporation can be achieved in a natural way, even when the number of genetic variables exceeds the number of observations. This method is then applied to wheat quality traits and a well established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO. This example focusses on the simultaneous incorporation and selection of up to 75,000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest. The results show, possibly for the first time, that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment.


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes, for both liver and fat samples, in >250 dairy cows, together with associated phenotypic data (milk yield, protein, casein and total solids percentage and yield, and growth hormone, IGF and insulin levels). This data is highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data was analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data was used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.
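A hedged sketch of the simple correlation screening step described above, with simulated stand-in objects rather than the actual expression and phenotype data:

```r
# Illustrative only: screen every gene against one phenotype with cor.test().
set.seed(1)
expr  <- matrix(rnorm(200 * 50), nrow = 200,
                dimnames = list(paste0("gene", 1:200), NULL))  # genes x cows
pheno <- rnorm(50)                                             # e.g. milk yield

res <- t(apply(expr, 1, function(g) {
  ct <- cor.test(g, pheno)
  c(r = unname(ct$estimate), p = ct$p.value)
}))

# Adjust for multiple testing across genes
res <- cbind(res, fdr = p.adjust(res[, "p"], method = "BH"))
head(res[order(res[, "p"]), ])
```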

Practicalities regarding the handling of such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to easily plot any gene expression-phenotype combination.


1540 - 1700

MONDAY 30TH NOV
Session 4, Swifts: Modelling
Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted. The conclusion of the study is then based on comparing the data to that pre-stated margin. Different opinions exist on how such margins should be determined. Some are highly statistical; some are based much more on clinical judgement; some are based on absolute differences between treatments, some on relative differences. There is little consensus across the medical, scientific and clinical trials communities on how small such margins should be, or even on what the principles for setting margins should be.

In a superiority study, although we may carry out a significance test of the null hypothesis of zero difference, we base decisions about using a treatment on far wider criteria. As a minimum, we consider the size of the benefit and the size of any adverse effects. Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment; nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment.

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin, to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample, these texture analysis and rheological methods generated many data points, which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively, by eye alone. I describe an Excel-based system, using Excel macros and R (via RExcel), that enabled researchers to more objectively identify points of interest and to process large numbers of sample results quickly.
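A minimal sketch, not the talk's RExcel system, of how points of interest on such curves might be located programmatically in R; the curve and variable names are hypothetical.

```r
# Illustrative only: find a peak and the point of steepest slope on a noisy curve.
set.seed(1)
time  <- seq(0, 10, by = 0.01)
force <- dnorm(time, mean = 4, sd = 1) + rnorm(length(time), sd = 0.002)

# Smooth first so that noise does not create spurious extrema
sm <- lowess(time, force, f = 0.05)

peak_idx  <- which.max(sm$y)                 # highest point
slope     <- diff(sm$y) / diff(sm$x)
steep_idx <- which.max(abs(slope))           # point of maximum slope

c(peak_time = sm$x[peak_idx], steepest_time = sm$x[steep_idx])
```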


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows. It is considered to be a leading cause of infectious lameness. BDD lesions tend to be highly painful, hence BDD has been identified as a major welfare concern. Although economic impacts are difficult to quantify, these are thought to be substantial.

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently, an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed, using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire, UK, to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test. Earlier studies had suggested a seasonal pattern, as well as dependency on the age and foot hygiene of the individual cow. Interestingly, our results show seasonality in lesion status but not in the serology score.

Here we treat lesion status and serology score as imperfect tests of infection status, which we model as an autocorrelated latent binary process. Covariate effects can enter in various ways into this model. We adopt a parsimonious approach, using MCMC for parameter estimation and for predicting the infection history of individual cows.


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurement of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to those of previous studies, in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which the rate of growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from Japanese and Chitty data. The Filipino babies were smaller compared to the data from Chitty and larger compared to the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth differs between populations. Conclusion: The standard growth curve for Filipinos should be used during the assessment of fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length
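A hedged sketch of percentile curves from a quadratic regression on gestational age, assuming normal residuals; the simulated data and names are illustrative assumptions, not the study data.

```r
# Illustrative only: 10th/50th/90th percentile curves for one fetal measurement.
set.seed(1)
ga  <- runif(500, 20, 42)                          # gestational age (weeks)
bpd <- 20 + 4.5 * ga - 0.05 * ga^2 + rnorm(500, sd = 3)

fit   <- lm(bpd ~ ga + I(ga^2))
sigma <- summary(fit)$sigma
grid  <- data.frame(ga = 20:42)
mu    <- predict(fit, newdata = grid)

curves <- data.frame(ga = grid$ga,
                     p10 = mu + qnorm(0.10) * sigma,
                     p50 = mu,
                     p90 = mu + qnorm(0.90) * sigma)
matplot(curves$ga, curves[, -1], type = "l", lty = 1,
        xlab = "Gestational age (weeks)", ylab = "BPD")
```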


1540 - 1700

MONDAY 30TH NOV
Session 4, Boardroom: Ecology
Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled as a function of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models, and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood-based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.
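A minimal sketch of comparing candidate presence/absence models under AIC, BIC and a generic GIC-type penalty in R; the data, variables and penalty value are assumptions, and the visualisation methods of the talk are not reproduced.

```r
# Illustrative only: candidate binomial GLMs compared under several criteria.
set.seed(1)
d <- data.frame(food = rnorm(200), cover = rnorm(200), slope = rnorm(200))
d$present <- rbinom(200, 1, plogis(-0.5 + 1.2 * d$food - 0.8 * d$cover))

forms <- list(present ~ food, present ~ food + cover,
              present ~ food + cover + slope)
fits  <- lapply(forms, glm, family = binomial, data = d)

# Generic information criterion: -2 logLik + lambda * (number of parameters)
gic <- function(fit, lambda) -2 * as.numeric(logLik(fit)) +
                             lambda * length(coef(fit))

data.frame(model = sapply(forms, function(f) deparse(f[[3]])),
           AIC  = sapply(fits, AIC),
           BIC  = sapply(fits, BIC),
           GIC3 = sapply(fits, gic, lambda = 3))
```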

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains, such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general, the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.
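A hedged sketch of a basic Bradley-Terry fit via logistic regression (not the authors' implementation); the indicators and judgement counts are invented for illustration.

```r
# Illustrative only: logit P(indicator i preferred to j) = lambda_i - lambda_j.
indicators <- c("water", "soil", "biodiversity")
cmp <- data.frame(ind1 = c("water", "water", "soil"),
                  ind2 = c("soil", "biodiversity", "biodiversity"),
                  win1 = c(7, 5, 4),    # experts preferring ind1
                  win2 = c(3, 5, 6))    # experts preferring ind2

# Design matrix: +1 for the first indicator in the pair, -1 for the second
X <- matrix(0, nrow(cmp), length(indicators),
            dimnames = list(NULL, indicators))
for (i in seq_len(nrow(cmp))) { X[i, cmp$ind1[i]] <- 1; X[i, cmp$ind2[i]] <- -1 }

# Fix the first indicator's ability at 0 by dropping its column
fit <- glm(cbind(cmp$win1, cmp$win2) ~ X[, -1] - 1, family = binomial)
exp(coef(fit))   # relative importance ratios versus 'water'
```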


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Mathematics, Informatics and Statistics, Australia
2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains. Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments, including defining an appropriate sampling frame, how best to handle the dynamic nature of the system, and taking into account the various fieldwork considerations. In this talk we discuss some of these challenges and present a spatial design, using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004, JASA 99, 262-278), that allocates sparse sampling resources across space to maximise the information available and to ensure reliable, credible and meaningful inferences can be made about the health of a particular Queensland river system.

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used in estimating a response surface model with covariates, from data generated through a central composite design. Backfitting takes advantage of the orthogonality generated by the central composite design on the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimation generally fails when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second order model, central composite design


TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York
Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J. Martin Bland1
1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established. Clustering also happens in trials where participants are allocated individually, when the intervention is provided by individual operators, such as surgeons or therapists. These operators form a hidden sample, whose effect is usually ignored. Recently, trial designers have been considering how they should allow for this clustering effect, and funders have been asking applicants the same question. In this talk I examine some of these issues and suggest one simple method of analysis.


950 - 1030

TUESDAY 1ST DEC
Session 1, Swifts: Missing Data
Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek,

Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial, the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were developed more recently in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defence of the above mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision, it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view, however, this cannot be tolerated. It can be shown easily that BOCF, as well as other carried-forward methods, violates the information in the data tremendously and leads to biased results which can be both conservative and liberal. In this manuscript, instead of contributing to the further breakdown of the carried-forward family, we state No Carrying Forward (NOCF) as the strategy to deal with missing data in the future. More precisely, while most researchers feel that different dropout rates in different treatment arms indicate the missing data mechanism to be MNAR, we will briefly show that this actually is consistent with MCAR. Furthermore, we will provide a method to deal with tolerability issues without the use of BOCF, by combining both efficacy and tolerability data in a direct likelihood approach.
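A minimal sketch of a direct likelihood analysis in R, using all available longitudinal observations in a mixed model rather than carrying values forward (valid under MAR); the data set and variable names are assumptions, not the speaker's example.

```r
# Illustrative only: every observed value contributes; no imputation or LOCF/BOCF.
library(lme4)

set.seed(1)
long <- expand.grid(id = factor(1:60), week = 0:6)
long$treat <- factor(ifelse(as.integer(long$id) <= 30, "active", "placebo"))
long$pain  <- 6 - 0.3 * long$week - 0.2 * long$week * (long$treat == "active") +
              rnorm(60)[long$id] + rnorm(nrow(long), sd = 0.8)
long$pain[sample(nrow(long), 80)] <- NA       # dropout / missed visits

fit <- lmer(pain ~ treat * week + (week | id), data = long,
            na.action = na.omit)
summary(fit)$coefficients
```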


APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data. This ignores the possibility of heterogeneity within classes, which will result in identification of extraneous classes. An improved method is to incorporate random effects into the latent class model. This is applied to data on bladder control in children, demonstrating the over-extraction of classes using standard latent class methods. A difficulty with this method is the assumption of normality of the random effect. This may be improved by assuming that each class is a mixture.

950 - 1030

TUESDAY 1ST DEC
Session 1, Boardroom: Count Data
Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

I will discuss my most recent thoughts on how to approach modelling count data which may contain extra zeros. We will work through an example, from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra zeros and possible overdispersion. We will illustrate the advantages of separating the effects of overdispersion and extra zeros, and show how to use diagnostics to deal successfully with these issues.
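A hedged sketch of a typical R workflow for counts with possible overdispersion and extra zeros (not the speaker's example); the simulated data and model choices are assumptions.

```r
# Illustrative only: Poisson fit, overdispersion check, then richer models.
library(MASS)   # glm.nb
library(pscl)   # zeroinfl

set.seed(1)
habitat <- rnorm(300)
mu      <- exp(0.5 + 0.7 * habitat)
counts  <- rnbinom(300, mu = mu, size = 1.2)        # overdispersed counts
counts[rbinom(300, 1, 0.2) == 1] <- 0               # extra zeros

fit_pois <- glm(counts ~ habitat, family = poisson)
sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)  # >> 1?

fit_nb <- glm.nb(counts ~ habitat)                          # overdispersion only
fit_zi <- zeroinfl(counts ~ habitat | 1, dist = "negbin")   # extra zeros too
AIC(fit_pois, fit_nb, fit_zi)
```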


A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively, rather than multiplicatively, to a collection of predictor variables. Such models have a range of applications, but are particularly important in epidemiology, where they can be used to model absolute differences in disease incidence rates as a function of covariates. A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable, due to the non-negativity constraints on the Poisson means. I will present a straightforward and flexible method, based on the EM algorithm, which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space. The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems. Such methods have not been used previously in conventional regression modelling, because they restrict the regression coefficients to be non-negative rather than the fitted means. I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space, after which the global constrained maximum is identified from among the subset maxima. Both categorical factors and continuous covariates can be accommodated, the latter having either a linear form or a completely unspecified isotonic form. The method is particularly useful with resampling methods such as the bootstrap, which may require reliable convergence for thousands of implementations. The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts.
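As a hedged sketch of the building block that such EM approaches rest on, and not the speaker's full method, the classical multiplicative EM update for an identity-link Poisson model with non-negative design and coefficients can be written in a few lines of R; the simulated data are an assumption.

```r
# Illustrative only: EM update for y ~ Poisson(X %*% beta), X >= 0, beta >= 0.
set.seed(1)
n <- 200
X <- cbind(1, runif(n), rbinom(n, 1, 0.4))    # non-negative covariates
beta_true <- c(1, 2, 1.5)
y <- rpois(n, X %*% beta_true)

beta <- rep(1, ncol(X))                       # positive starting values
for (iter in 1:500) {
  mu   <- as.vector(X %*% beta)
  beta <- beta * colSums(X * (y / mu)) / colSums(X)   # multiplicative EM step
}
rbind(truth = beta_true, EM = round(beta, 3))
```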


1130 - 1230

TUESDAY 1ST DEC
Session 2, Swifts: Medical
Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects. These assessments usually involve multiple QOL questionnaires, each containing a mix of items about diverse specific and global aspects of QOL. Quality of life itself is regarded as an unobserved underlying construct.

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies. Common approaches include selecting from, or averaging, the one or two direct global item measures obtained, or calculating a summary score from the subdimensional item measures of a QOL questionnaire. An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL. The first two approaches do not take advantage of all the information collected, while the third assumes that questions of interest fall into a relatively small number of theoretical domains, which may not always be the case.

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework, using data from two clinical studies in cancer patients. This methodology utilises all the available data, accommodates the common problem of missing item responses, obviates the need for precalculated or selected summary scores, and can capture underlying correlations and dimensions in the data.

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures. Models that delineate QOL scales will be compared with those that delineate QOL domains, and the contribution of different variance components will be assessed. Since the data comprise a mix of non-normal continuous response measures and ordinal response measures, distributional issues will also be considered.


ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1 Andrea Rotnitzky23 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T. di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes, also called adaptive strategies, are individually tailored treatments based on patient covariate history. Optimal dynamic regimes (ODR) are rules that lead to the highest expected value of some utility function at the end of a time period. Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates. For example, one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV infected subjects, a rule which depends on the covariate history only through the minimum CD4 count.

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models, and derive a locally efficient estimator in the class.

The approach is applied to a cohort of HIV positive patients, to illustrate estimation of the optimal CD4 count level at which to start HAART.


PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models, the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk compared with those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories, and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.
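A minimal sketch of a recurrent-event gap-time model with a gamma frailty in R's survival package, using its built-in bladder cancer data; this is a semiparametric Cox version shown only for illustration, not the parametric Weibull frailty models of the study.

```r
# Illustrative only: gap-time Cox model with a per-subject gamma frailty.
library(survival)

fit <- coxph(Surv(stop - start, event) ~ rx + frailty(id, distribution = "gamma"),
             data = subset(bladder2, stop > start))
summary(fit)
```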


1130 - 1230

TUESDAY 1ST DEC
Session 2, Boardroom: Modelling
Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney, F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney, F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities and is in that sense superior to more basic procedures, including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.

We further investigated in more detail to what extent missingness introduced bias. We were particularly interested in the question: how much missingness is too much to produce reliable statistical inference? The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data. We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing parameter estimates after imputation similar to those found when data were fully observed. It will be shown that the amount of missingness present in the data set, and the nature of the variable in question, affect the parameter estimates and their respective distributions. We believe this to be an important aspect of statistical model building when multiple imputation is applied.
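A minimal sketch of multiple imputation followed by pooled logistic regression with the mice package; the data set and the number of imputations are assumptions, not the authors' procedure.

```r
# Illustrative only: impute, fit the logistic model on each completed data set,
# then combine estimates by Rubin's rules.
library(mice)

set.seed(1)
d <- data.frame(y  = rbinom(200, 1, 0.3),
                x1 = rnorm(200), x2 = rnorm(200))
d$x1[sample(200, 40)] <- NA               # missing covariate values

imp  <- mice(d, m = 20, printFlag = FALSE)
fits <- with(imp, glm(y ~ x1 + x2, family = binomial))
summary(pool(fits))
```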


STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables. This has become an increasingly important problem as more scientists facilitate techniques to produce high dimensional data to unveil hidden information. Although several model-based methods are promising for the identification of influential marker sets in some real applications, each method has its own advantages and limitations. The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters, which is still a difficult task. This article provides a completely different solution, with a simple and novel idea, for the identification of influential sets of variables. The method is simple, as it involves only repeatedly implementing single-term analysis of variation. The main idea is to stepwise pare down the total variation of the responses so that the remaining influential sets of factors have increased chances of being identified. We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages, including the popular group lasso, logic regression and Bayesian QTL mapping methods, through simulation studies. A real data example shows additional interesting findings that result from using the proposed algorithm. We also suggest ways to reduce the computational burden when the number of factors is very large, say thousands.


EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e. the complement of 1 - p2, which is the test specificity. Since 1975, the ratios r+ = p1 / p2 and r- = (1 - p1) / (1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio r_x = f1(x) / f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating r_x and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.


WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington
Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample, for a "validation study", or a sample stratified on a health outcome, for a "case-control study". It is now well established that stratifying the sampling on exposure variables, in addition to outcome, can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.
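A hedged sketch of one way such subsampled (two-phase) data can be analysed in R's survey package; the cohort, sampling scheme and variable names are invented for illustration and are not from the keynote.

```r
# Illustrative only: an expensive covariate measured on a stratified subsample.
library(survey)

set.seed(1)
cohort <- data.frame(id = 1:5000,
                     outcome = rbinom(5000, 1, 0.1),
                     cheap   = rnorm(5000))
# Phase 2: take all cases and a 10% sample of controls
cohort$insample  <- cohort$outcome == 1 | rbinom(5000, 1, 0.1) == 1
cohort$expensive <- ifelse(cohort$insample, cohort$cheap + rnorm(5000), NA)

des <- twophase(id = list(~id, ~id), strata = list(NULL, ~outcome),
                subset = ~insample, data = cohort)
summary(svyglm(outcome ~ expensive + cheap, design = des,
               family = quasibinomial()))
```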


950 - 1030

WEDNESDAY 2ND DEC
Session 1, Swifts: Clinical Trials
Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al. (2002, Biometrical J. 44, 227-239) investigated the use of weighted methods, originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting, along with a number of other, more efficient methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al. 1977, Ann Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al. for their simulations.


CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, USA National Cancer Institute
2Texas A&M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease. It is now well appreciated that FFQs involve substantial measurement error, both random and systematic, leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships. To correct for this error, many cohorts include calibration sub-studies, in which more precise short-term dietary instruments, such as multiple 24-hour dietary recalls (24HRs) or food records, are administered as reference instruments. Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements. Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic, since short-term reference instruments usually include a substantial proportion of subjects with zero intakes, violating the assumption that reported intake is continuous. We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods, using a repeat unbiased short-term reference instrument in a calibration sub-study. We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study.


950 - 1030

WEDNESDAY 2ND DEC
Session 1, Boardroom: Fisheries
Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary tasks for understanding biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue, a simple conceptual model is proposed, reflecting the sampling process commonly used in marine surveys, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon, and highlights that the widely used method called sub-sampling is quite influential on presence/absence measures of species, which is no longer ignorable.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC), held last year in the Taupo-Rotorua regions, resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days. Some interesting results are presented in this talk, e.g. catch reduction effects, fish length distributions, and testing whether teams strategically targeted smaller sized fish. Some broader generic issues and recommendations for future fishing competitions are discussed in light of the analyses, e.g. modifying the point system to give greater reward to bigger fish.


1100 - 1220

WEDNESDAY 2ND DEC
Session 2, Swifts: Medical Models
Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS: A COMPARISON OF METHODS FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs). An adjusted relative risk can be estimated using log binomial regression; however, convergence problems are common with this model. Alternative methods have been proposed for estimating the relative risk, such as log Poisson regression with robust variance estimation (Zou 2004, Am J Epi 159, 702-706). Comparisons between methods have been limited, however, particularly in the context of RCTs. We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations. Results of an extensive simulation study suggest that when adjustment is made for a binary and/or continuous baseline covariate, some methods may fail to overcome the convergence problems of log binomial regression, while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals. Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate. We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations.
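A minimal sketch of two of the commonly used approaches, log binomial regression and the modified (robust variance) log Poisson method, in R; the simulated trial data are an assumption and the ten methods of the study are not reproduced.

```r
# Illustrative only: adjusted relative risk for a binary outcome.
library(sandwich)
library(lmtest)

set.seed(1)
d <- data.frame(treat = rbinom(500, 1, 0.5), age = rnorm(500))
d$y <- rbinom(500, 1, pmin(0.9, exp(-1.5 + log(1.4) * d$treat + 0.2 * d$age)))

# Log-binomial: direct, but may fail to converge (hence try() and start values)
fit_lb <- try(glm(y ~ treat + age, family = binomial(link = "log"),
                  data = d, start = c(-1, 0, 0)))

# Modified Poisson (Zou 2004): log link with robust (sandwich) standard errors
fit_mp <- glm(y ~ treat + age, family = poisson(link = "log"), data = d)
coeftest(fit_mp, vcov = vcovHC(fit_mp, type = "HC0"))

if (!inherits(fit_lb, "try-error")) exp(coef(fit_lb)["treat"])
exp(coef(fit_mp)["treat"])   # adjusted relative risk estimates
```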


MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail espark02gmailcom

The idea of response-adaptive design in clinical trials is to allocate more subjects to the superior treatment during a trial without unduly diminishing statistical significance and efficiency. In addition, advances in genomics-related biomedical research are making personalised medicine possible, which in turn makes adjustment for the covariates of subjects who join the trial an important issue in clinical trials.

Adaptive design is a longstanding statistical approach for situations where the design of an experiment involves unknown parameters that must be estimated during the course of the experiment, so the concept of sequential analysis is naturally involved. The large-sample properties of estimation under such a scheme have been studied in the literature, for example Zhang et al (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires both the estimation and the design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure, the multiple-stage method, which requires the estimation and design to be updated only at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it retains the advantages of the fully sequential method to some degree while being more convenient in practical operation. Here we study the three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesised data is also presented.

Traditionally, a response-adaptive (RA) design is used under the assumption that there is no treatment-covariate interaction, that is, the slopes of the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is used under the assumption of no interaction between covariates and treatments. Furthermore, the RA design will then make incorrect treatment allocations: it can be correct in one part of the population but completely wrong in the other. Thus, in this case, the CARA design should perform better than the RA design.


In this work we also compare sequential analysis in response-adaptive designs with and without covariate adjustment, and a numerical study of synthesised data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago Christchurch

E-mail patrickgrahamotagoacnz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple-category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes, and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30 day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.


LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail iliumsorvuwacnz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary case has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, in many surveys respondents may select more than one outcome category, so observations can fall in more than one category in the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods to estimate the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.
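
For readers unfamiliar with the quantity being estimated, the following R sketch computes the sample local odds ratios of an ordinary (mutually exclusive) two-way table; the table is invented, and the authors' maximum likelihood and Mantel-Haenszel estimators for multiple-response tables are not reproduced here.

    # Sample local odds ratios for an ordinary I x J table
    # (adjacent rows i, i+1 and adjacent columns j, j+1).
    local_odds_ratios <- function(tab) {
      I <- nrow(tab); J <- ncol(tab)
      out <- matrix(NA, I - 1, J - 1)
      for (i in seq_len(I - 1)) {
        for (j in seq_len(J - 1)) {
          out[i, j] <- (tab[i, j] * tab[i + 1, j + 1]) /
                       (tab[i, j + 1] * tab[i + 1, j])
        }
      }
      out
    }

    # Hypothetical 3 x 3 table of counts
    tab <- matrix(c(20, 15,  5,
                    10, 25, 15,
                     5, 10, 30), nrow = 3, byrow = TRUE)
    local_odds_ratios(tab)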


11:00 - 12:20
WEDNESDAY 2ND DEC, Session 2, Boardroom: Agriculture/Horticulture
Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1

1AgResearch Invermay Agricultural Centre

E-mail rogerlittlejohnagresearchconz

Global Positioning Systems (GPS) technology was used to determine the positions of farmed red deer hinds during the calving and lactation on an extensively managed high-country station Meteorological data was collected from a nearby weather station We describe an analysis of the data relating hind behaviour (half-hourly distance travelled altitude habitat occupancy) to environmental factors Hinds showed strong individualisation in their core occupancy areas with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones Heavier hinds selected lower flatter zones of naturalised grass while smaller hinds tended to select higher altitudinal zones dominated by tussock During the pre-calvingparturition period there was no evidence of any influence of weather variables on behaviour indicating that reproductive behaviours described by a simple hidden Markov model took precedence over general behavioural patterns During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables with associations between altitude and wind speed and between distance travelled and solar irradiation and temperature

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario DrsquoAntuono1 and Peter Clarke1

1Dept of Agriculture and Food Western Australia

E-mail mdantuonoagricwagovau

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the seeming lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS: ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1 Murray Judd2 John Meekings3 Annette Richardson3 and Eric Walton4

1Swinburne University of Technology
2Seeka Kiwifruit Industries
3The New Zealand Institute for Plant and Food Research Ltd
4University of Otago

E-mail dmeyerswineduau

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail Paulinedinganueduau

There is a growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems CWD is suggested as a source of nutrients and refugia for flora and fauna however little is known about how CWD influences the microenvironments it creates

Logs are the key constituent of CWD and soil temperature around the logs is one of the indicators to quantify the effects of CWD in modified woodlands In the study a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days The treatments of interest were the Ground Cover Types (covered uncovered) Distance from the log (0cm 10cm 20cm 40cm 80cm) Depth (1cm 5cm) Two non-linear mixed models were used to study the different treatment effects

13:30 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment
Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1
1Wagga Wagga Agricultural Institute, Australia

2Rothamsted Research Harpenden UK

E-mail alisonsmithindustrynswgovau

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield Such trials are also used to obtain information on grain quality traits but these are rarely subjected to the same level of statistical rigour The data are often obtained using composite rather than individual replicate samples This precludes the use of an efficient statistical analysis In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield) This allows application of efficient mixed model analyses for both grain yield and grain quality traits


14:10 - 15:10
WEDNESDAY 2ND DEC, Session 3, Swifts: Design
Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail emlynwilliamsanueduau

Most plant breeding trials involve a layout of plots in rows and columns Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects Further improvement may be possible by postblocking or by inclusion of spatial model components Options are investigated for augmenting a baseline row-column model by the addition of spatial components The models considered include different variants of the linear variance model in one and two dimensions Usefulness of these options is assessed by presenting results from a number of field variety trials

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail kcha193aucklanduniacnz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetic-mediated changes across this wall have been difficult to explore. However, recent advances in high throughput technologies, for proteomic profiling (i.e. protein identification and quantification) for example, are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second phase laboratory-based experiment The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2

1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics Informatics and Statistics, Melbourne, Australia

E-mail kruggieroaucklandacnz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.
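
Since LIMMA is mentioned above, the following is a generic limma sketch for a small dye-swapped two-colour comparison; the log-ratio matrix, the design column and the coefficient name are hypothetical and do not correspond to the authors' actual design.

    library(limma)
    set.seed(1)
    # Hypothetical normalised log-ratio matrix: 1000 probes x 6 arrays
    M <- matrix(rnorm(6000), nrow = 1000)
    # Hypothetical design: +1 when the acidified sample is in the red channel,
    # -1 when the pair is dye-swapped
    design <- cbind(acidified = c(1, -1, 1, -1, 1, -1))
    fit <- lmFit(M, design)     # per-probe linear model
    fit <- eBayes(fit)          # empirical Bayes moderated t-statistics
    topTable(fit, coef = "acidified", number = 5)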

14:10 - 15:10
WEDNESDAY 2ND DEC, Session 3, Boardroom: Functional Analysis
Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1 Gillian Whalley1 Rob Doughty1 and Chris Triggs1

1The University of Auckland

E-mail kpoppeaucklandacnz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat) However measurements that assess cardiac function are traditionally taken at only brief moments during that process and assess contraction separately to relaxation

Three-dimensional ultrasound images of the heart allow volume in the left ventricle (LV) to be measured from each frame of an imaging sequence Using functional data analysis the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken Plotting volume against the first and second derivatives evolves a closed loop in three dimensions After finding the projection that maximises the area within the loop we can compare the areas during contraction and relaxation and so develop a new measure of global cardiac function
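
The following R sketch illustrates the general idea on a simulated volume curve (not echocardiographic data): smooth the repeated volume measurements, take first and second derivatives, and trace the resulting loop in three dimensions. The authors' projection step that maximises the loop area is not reproduced.

    set.seed(1)
    t   <- seq(0, 1, length.out = 40)                        # one cycle (scaled time)
    vol <- 120 - 50 * sin(pi * t)^2 + rnorm(40, sd = 1.5)    # noisy LV volume (ml)

    sm  <- smooth.spline(t, vol)                             # smooth the volume curve
    tt  <- seq(0, 1, length.out = 200)
    V   <- predict(sm, tt, deriv = 0)$y
    dV  <- predict(sm, tt, deriv = 1)$y                      # first derivative
    d2V <- predict(sm, tt, deriv = 2)$y                      # second derivative

    loop <- cbind(V, dV, d2V)   # closed loop in three dimensions
    pairs(loop)                 # crude view of the loop's pairwise projections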


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail davidcliffordcsiroau

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology separates the test substance into its constituent compounds and quantifies the amount of each. Typically the first step in an analysis of data like this is alignment, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances, e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al (Anal Chem 2009 81 (3) pp 1000-1007)
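
A compact dynamic-programming sketch of the penalty idea is given below: a fixed penalty is added to the local distance whenever a non-diagonal step is taken. This illustrates the mechanism only; the authors' variable-penalty scheme and their penalty selection procedure are not reproduced, and the signals are simulated.

    dtw_penalised <- function(x, y, penalty = 0) {
      n <- length(x); m <- length(y)
      d <- abs(outer(x, y, "-"))          # local distance matrix
      D <- matrix(Inf, n + 1, m + 1)
      D[1, 1] <- 0
      for (i in 1:n) {
        for (j in 1:m) {
          D[i + 1, j + 1] <- d[i, j] + min(D[i, j],                # diagonal step
                                           D[i, j + 1] + penalty,  # vertical step
                                           D[i + 1, j] + penalty)  # horizontal step
        }
      }
      D[n + 1, m + 1]                     # total alignment cost
    }

    x <- sin(seq(0, 6, length.out = 80))
    y <- sin(seq(0, 6, length.out = 80) - 0.3)   # time-shifted copy of x
    c(no_penalty = dtw_penalised(x, y, 0), penalised = dtw_penalised(x, y, 0.1))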


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1 Qi Zhu1 and Dirk Valkenborg2

1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail tomaszburzykowskiuhasseltbe

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, e.g., two-channel cDNA microarrays: peptides from two biological samples are analyzed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da of the peaks corresponding to the isotopic distributions of peptides from the labeled sample is induced, which allows one to distinguish them from the peaks of the peptides from the unlabeled sample and, consequently, to quantify the relative abundance of the peptides.

However due to the presence of small quantities of 16O and 17O atoms during the labeling step the labeled peptide may get various oxygen isotopes As a result not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum This incomplete labeling may result in the biased estimation of the relative abundance of the peptide

To address this issue, Valkenborg et al (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation of it. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15:40 - 17:00
WEDNESDAY 2ND DEC, Session 4, Swifts: Methods
Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1
1University of Melbourne

E-mail sjclarkeunimelbeduau

Multiple hypothesis testing is a research area that has grown considerably in recent years as the amount of data available to statisticians grows from a variety of applications High-dimensional contexts have their own challenges in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate or FDR) low

In these contexts the assumption of independence between test statistics is commonly made although this is rarely true This talk will discuss the ramifications of this assumption in the context of a linear process There are many situations where the effects of dependence are minimal which is assuring However there may also be instances when this is not the case which will be discussed in more detail

Many methods to correct for dependence involve correlation estimates which are computationally difficult (and even inaccurate) in the high-dimensional context Others provide overly conservative adjustments and consequently result in a loss of statistical power Ideally understanding the effects of dependence on quantities like FWER or FDR should enable us to improve the power of our procedures to control these quantities

As well as summarising some of the existing results in this area this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored with the aim of developing methods to adjust for it
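
As a small, self-contained illustration of the FDR control being discussed (not the authors' linear-process analysis), the following R sketch applies the Benjamini-Hochberg adjustment to p-values from simulated, equicorrelated test statistics.

    set.seed(1)
    m   <- 2000                                  # number of tests
    rho <- 0.6
    z   <- sqrt(rho) * rnorm(1) + sqrt(1 - rho) * rnorm(m)   # equicorrelated noise
    z[1:100] <- z[1:100] + 3                     # 100 true effects
    p   <- 2 * pnorm(-abs(z))

    rejected <- p.adjust(p, method = "BH") <= 0.05
    table(truth = rep(c("effect", "null"), c(100, m - 100)), rejected)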


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1 Bo Cai2 and Francois Perron3

1University of Auckland, New Zealand
2University of South Carolina, USA

3University of Montreal Canada

E-mail meyerstataucklandacnz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customizing the proposal density in the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm for sampling from non-logconcave univariate densities. Using various examples we demonstrate their properties and efficiencies, and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.
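
For orientation, the R sketch below shows the bare-bones random-walk Metropolis-Hastings skeleton that such adaptive proposals plug into, applied to a non-logconcave univariate target; the triangular/trapezoidal mixture proposals of the talk are not implemented here.

    mh_sample <- function(log_target, n_iter = 10000, start = 0, sd_prop = 1) {
      x    <- numeric(n_iter)
      x[1] <- start
      for (t in 2:n_iter) {
        prop <- rnorm(1, x[t - 1], sd_prop)                   # symmetric proposal
        log_alpha <- log_target(prop) - log_target(x[t - 1])  # log acceptance ratio
        x[t] <- if (log(runif(1)) < log_alpha) prop else x[t - 1]
      }
      x
    }

    # Example target: a two-component normal mixture (not logconcave)
    log_target <- function(x) log(0.4 * dnorm(x, -2, 0.5) + 0.6 * dnorm(x, 2, 1))
    draws <- mh_sample(log_target, 20000, start = 0, sd_prop = 2)
    mean(draws > 0)   # should be close to the 0.6 mixing weight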


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1
1Victoria University of Wellington

E-mail nsibandamsorvuwacnz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified This problem is further compounded when data for some of the categories is sparse Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed In one approach an exact likelihood is used The second approach uses an augmented data likelihood The performance of the two methods is assessed using a number of prior distributions The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21 Data from one Single Nucleotide Polymorphism (SNP) is used first and improvements in performance are then assessed for additional SNPs As a validation check the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1
1University of Auckland

E-mail jbri002stataucklandacnz

There is currently no methodology for assimilating moderate or high dimension observations into high dimension spatiotemporal model estimates with general distribution In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation The methodology is simple to implement and provides good results in both a simulation study and a real example


15:40 - 17:00
WEDNESDAY 2ND DEC, Session 4, Boardroom: Mixtures & Classification
Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1
1University of Sydney

E-mail mstewartusydeduau

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case If the mixing distribution has controlled tails then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension We discuss some potential applications to the study of age distribution in fish populations

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail cheestataucklandacnz

It may sometimes be clear from background knowledge that a population under investigation proportionally consists of a known number of subpopulations whose distributions belong to the same yet unknown family While a parametric assumption can be used for this family one can also estimate it nonparametrically to avoid distributional misspecification Instead of using the standard kernel-based estimation as suggested in some recent research in the literature in this talk we describe a new approach that uses nonparametric mixtures for solving the problem We show that the new approach performs better through simulation studies and some real-world biological data sets


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1 Nafees Anwar1 and Selvanayagam Ganesalingam1

1Massey University

E-mail sganeshmasseyacnz

Classification is a popular modelling idea in Statistics and Data Mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function or rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally-sized or balanced, and classification techniques assume that the misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class or group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). Traditional classification techniques perform badly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being on building models that correctly classify the minority class.

In this presentation, a brief overview of the approaches found in the literature is given, followed by details of some proposed alternatives. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples and the findings are discussed.
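
A minimal R illustration of the first approach mentioned above, rebalancing by over-sampling the minority class before fitting an ordinary classifier, is sketched below on simulated data; it is not one of the authors' proposed methods.

    set.seed(1)
    n  <- 1000
    x1 <- rnorm(n); x2 <- rnorm(n)
    y  <- rbinom(n, 1, plogis(-3 + 1.5 * x1))       # roughly 5-10% minority class
    dat <- data.frame(y, x1, x2)

    minority <- dat[dat$y == 1, ]
    majority <- dat[dat$y == 0, ]
    over     <- minority[sample(nrow(minority), nrow(majority), replace = TRUE), ]
    balanced <- rbind(majority, over)                # equal class sizes after over-sampling

    fit_raw <- glm(y ~ x1 + x2, family = binomial, data = dat)
    fit_bal <- glm(y ~ x1 + x2, family = binomial, data = balanced)
    # The rebalanced fit shifts the intercept, raising sensitivity for the
    # minority class at the cost of more false positives.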


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1 Siva Ganesh1 and A Nanthakumar1

1Massey University

E-mail sganeshmasseyacnz

The estimation of error rates is of vital importance in classification problems, as this is used as a basis to choose the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally, a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of their associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and the QDF are derived and computed for various covariance structures in a simulation exercise, and these serve as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. It also provides a closed form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations, such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data


THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland
Chair: Ruth Butler

NUTRIGENOMICS – A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1
1The University of Auckland and Nutrigenomics New Zealand

E-mail cmtriggsaucklandacnz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory Bowel Diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9:50 - 10:30
THURSDAY 3RD DEC, Session 1, Swifts: Genetics
Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao1,2, Emmanuelle Meugnier3 and Geoffrey McLachlan4
1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail klecaouqeduau

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade, ...). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the other hand, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy to tackle complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap. 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1
1Department of Statistics, University of Auckland

E-mail jliu070aucklanduniacnz

Effective population size (Ne) is a fundamental parameter of interest in ecology It provides information on the genetic viability of a population and is relevant to the conservation of endangered species Linkage disequilibrium due to genetic drift is used to estimate Ne however the performance of the usual estimator can be very poor In this talk I will introduce a new estimator based on a diffusion approximation and use simulations to compare its performance with the existing linkage disequilibrium estimator for Ne

9:50 - 10:30
THURSDAY 3RD DEC, Session 1, Boardroom: Ecology
Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail teresaneemananueduau

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS: A STATISTICAL ENCOUNTER

Dave Saville1
1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail savillestatgmailcom

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities Statistical advice was sought as to how best to summarise and interpret such data and comment was sought on previous attempts at analysis The resulting work raised some interesting issues that I plan to discuss in this talk Since such data are the subject of hearings I shall disguise the context using the fictitious setting of an elephant park with nearby housing developments experiencing the noise impact of trumpeting


11:00 THURSDAY 3RD DEC, Invited Speaker (Swifts): Kaye Basford, University of Queensland
Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1, IH Delacy1,2, J Crossa3, PM Kroonenberg4, MJ Dieters1 and KE Basford1,2
1The University of Queensland, Australia
2Australian Centre for Plant Functional Genomics, Australia
3CIMMYT, Mexico
4Leiden University, The Netherlands

E-mail kebasforduqeduau

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al, 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address the non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analysing the same genotypes for different TAM block and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


11:40 - 12:20
THURSDAY 3RD DEC, Session 2, Swifts: Medical
Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1
1Institute of Statistical Science, Academia Sinica

E-mail ycchangsinicaedutw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and its related measures, such as the whole and partial area under the curve (AUC and pAUC respectively). In some medical diagnostics the false positive rate must be confined within a specific range, which makes the pAUC a reasonable choice under such circumstances. Thus we emphasise the pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximises the partial AUC is

l_p = (w_D Σ_D + w_D̄ Σ_D̄)^(-1) (μ_D − μ_D̄),

where μ_D, Σ_D and μ_D̄, Σ_D̄ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients w_D, w_D̄ ∈ R^1 depend on the given specificity and are also functions of l_p. Thus the solution for l_p requires some iteration procedure. We apply it to the data set of Liu et al (2005, Stat in Med) and the numerical results show that our method outperforms that of Liu et al (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximises the pAUC. This method can be applied to problems in which the markers outnumber the subjects. Some large sample properties of this method are derived. We then apply it to some real data sets, and the results are very promising, locating markers that are never found via AUC-based methods.


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1
1Rhein Ahr Campus

E-mail neuhaeuserrheinahrcampusde

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed to perform separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and to combine the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser, 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study; both phases are analysed at the end of the study. Therefore, an asymmetric decision rule as proposed by Bauer & Köhne (1994) for adaptive designs is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1·p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994). For example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994) Biometrics 50, 1029-41. Lösch & Neuhäuser (2008) BMC Medical Research Methodology 8, 16.
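
The decision rule quoted in the abstract can be sketched as a small R function. The constants below are illustrative assumptions only: α0 and α1 are taken from the abstract's α = 0.05 example, and cα is set to the plain Fisher combination critical value rather than the calibrated constant of Bauer & Köhne.

    modified_combination_test <- function(p1, p2,
                                          alpha0  = 0.5,
                                          alpha1  = 0.1793,
                                          c_alpha = exp(-0.5 * qchisq(0.95, df = 4))) {
      # Significant if both phase-wise p-values are small enough on their own,
      # or if they pass the alpha0 screen and Fisher's product criterion.
      max(p1, p2) <= alpha1 ||
        (max(p1, p2) <= alpha0 && p1 * p2 <= c_alpha)
    }

    modified_combination_test(0.03, 0.20)   # example call with two phase-wise p-values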


11:40 - 12:20
THURSDAY 3RD DEC, Session 2, Boardroom: Genetics
Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2
1CSIRO Mathematics Informatics and Statistics
2CSIRO Food Futures National Research Flagship

3CSIRO Plant Industry

E-mail EmmaHuangcsiroau

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represent phenotypic and genotypic diversity from across a population The design was proposed in order to enable the detection of significant genomic regions with far greater precision than previously possible A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map However marker map construction in complex experimental crosses is nontrivial

At the heart of map construction is the calculation of marker genotype probabilities In bi-parental crosses such as backcrosses and F2 designs these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values In MAGIC populations due to multiple founders and intermediate generations being unobserved unambiguously inferring the marker genotypes is often no longer possible Consequently current map building software cannot be used for map construction in MAGIC populations

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1 Benoit Auvray1 Peter Amer2 Sheryl-Anne Newman1 and Sheryl-Anne McEwan

1AgResearch Invermay Agricultural Centre, Mosgiel, New Zealand
2AbacusBio Limited, Dunedin, New Zealand

E-mail kendoddsagresearchconz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement This may be a measurement on the individuals that are genotyped or it might be a combination of information from the individual and their relatives An example of the latter is the use of estimated breeding values for bulls in the dairy industry However it is important that the phenotypes in the training and validation sets are independent For the dairy example this will almost be true if each bull has a large number of measured offspring as in this case the estimated breeding value depends almost entirely on the offspring In applying these methods to the sheep industry we find a wide range of offspring numbers We discuss methods to determine useful training and validation sets and appropriate phenotypes for datasets such as those in the sheep industry


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler1
1Plant and Food Research

E-mail RuthButlerplantandfoodconz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inference in general has interpretations that are assumed, but often not valid, for a classical inference. For example, p-values are often interpreted in a classical analysis as giving 1 − p as the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p-value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews, 2001)1. In this poster, Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1 Matthews (2001 J Stat Plan Inf 94 43-58)
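
A toy R illustration of the contrast drawn above (not the poster's two data sets): a one-sided p-value for a binomial proportion versus the posterior probability of the same hypothesis under a uniform Beta(1, 1) prior.

    x <- 14; n <- 20                          # 14 successes out of 20 trials
    # Classical: p-value for testing theta = 0.5 against theta > 0.5
    binom.test(x, n, p = 0.5, alternative = "greater")$p.value

    # Bayesian: posterior is Beta(1 + x, 1 + n - x); probability that theta > 0.5
    1 - pbeta(0.5, 1 + x, 1 + n - x)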


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1 Tony Swain2 Olena Kravchuk1 and Geoffry Fordyce2

1School of Land Crop and Food Sciences, University of Queensland, Qld, Australia
2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail okravchukuqeduau

The strong seasonal cycle in the nutritive value of North Queensland pasture affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialled Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and Androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<57) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model, fitted by REML in GenStat, included paddock and heifer effects as random terms and unequal variances for the repeated measures, with the heifer × day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al, Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The vaccine × weaner size × time interaction was only significant in 1992. The vaccine × time interaction was significant in 1990 and 1993. In 1991, no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccination, but the vaccine effect diminished as heifers aged. The nutrition × weaner size × time interaction was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study, with its complicated, unbalanced repeated measures design.


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1
1The New Zealand Institute for Plant and Food Research Limited

E-mail patrickconnollyplantandfoodconz

The number of insects found in any particular location can depend on a very large number of variables and many of those could interact with one another One method of showing such interactions is the regression tree However though it gives a good representation of the relationship between the variables such representations are not very stable Omitting a single data point can result in a substantially different picture being created

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees with each successive tree being built to fit the residuals of the preceding tree Using the R package gbm typically half of the data is used to establish the tree which is used to predict the other half of the data By examining the predictive ability of several thousands of trees produced in this manner the relative influence of the variables can be ranked

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients which could be used in spreadsheet calculations.
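
Since the gbm package is mentioned above, the following R sketch shows a generic boosted regression tree call of the kind described, with half of the data used for fitting via train.fraction; the data frame and variable names are invented stand-ins, not the cicada models themselves.

    library(gbm)
    set.seed(1)
    # Hypothetical stand-in for the orchard/landscape predictors and counts
    orchards <- data.frame(shelter = runif(200), ground_cover = runif(200),
                           elevation = rnorm(200), age = rpois(200, 10))
    orchards$abundance <- rpois(200, exp(0.5 + 1.2 * orchards$shelter))

    fit <- gbm(abundance ~ ., data = orchards,
               distribution = "poisson",        # insect counts
               n.trees = 3000, shrinkage = 0.01, interaction.depth = 3,
               train.fraction = 0.5)            # fit on half, assess on the other half

    best <- gbm.perf(fit, method = "test")      # number of trees minimising test error
    summary(fit, n.trees = best)                # relative influence of each predictor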


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1 Dongwen Luo1 and Fred Potter1

1AgResearch Limited

E-mail johnkoolaardagresearchconz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep and in the soil and grass surrounding the faeces. The data come from an 18 month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2

1School of Land Crop and Food Sciences, University of Queensland
2Centre for Nutrition and Food Sciences, University of Queensland

E-mail okravchukuqeduau

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling in an experiment investigating the digestibility of sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions were obviously, and in a complex way, changing with changes in the milling energy. The average volumetric diameter alone was not an adequate summary for the distributions. It was thus necessary to construct a tailored algorithm for summarising the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-Normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.
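
The following R sketch shows one simple way of fitting a three-component log-Normal mixture to binned volumetric percentages, by least squares on the bin proportions via optim; the bin breaks and percentages are invented, and the authors' Weibull comparison and goodness-of-fit assessment are not reproduced.

    breaks <- c(0, 5, 10, 20, 40, 80, 160, 320, 640, Inf)       # particle size bins (um)
    obs    <- c(2, 5, 10, 18, 25, 20, 12, 6, 2) / 100           # observed bin proportions

    mix_cdf <- function(x, par) {
      w <- exp(par[1:3]); w <- w / sum(w)                       # mixing weights
      w[1] * plnorm(x, par[4], exp(par[7])) +
      w[2] * plnorm(x, par[5], exp(par[8])) +
      w[3] * plnorm(x, par[6], exp(par[9]))
    }
    obj <- function(par) {
      p <- diff(mix_cdf(breaks, par))                           # implied bin probabilities
      sum((p - obs)^2)
    }
    start <- c(0, 0, 0, log(8), log(40), log(200), log(0.5), log(0.5), log(0.5))
    fit <- optim(start, obj, method = "BFGS")
    round(diff(mix_cdf(breaks, fit$par)), 3)                    # fitted bin proportions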


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1 Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne
2Remote sensing team, CSIRO Sustainable Ecosystems

3Senior Lecturer The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. They are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
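
A minimal sketch of the shrinkage fits in R with the glmnet package (placeholder objects X, y and X_new; the 0.632+ bootstrap and GCV selection described above are replaced here by ordinary cross-validation for brevity):

library(glmnet)
cv_lasso <- cv.glmnet(X, y, alpha = 1)          # LASSO
cv_ridge <- cv.glmnet(X, y, alpha = 0)          # ridge regression
coef(cv_lasso, s = "lambda.min")                # coefficients at the CV-chosen penalty
pred <- predict(cv_ridge, newx = X_new, s = "lambda.min")  # predictions for new pixels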


CAUTION COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1 David Lovell1 Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics Informatics and Statistics, Canberra, Australia
2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of the analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens of thousands, if not hundreds of thousands, of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and to explore the extent to which this might be a problem in applications. In particular we compare the analysis of log-transformed data to a full compositional data analysis.
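
As a rough illustration of where a full compositional analysis starts, the centred log-ratio (clr) transform of a samples-by-components count table can be computed in a few lines of R ('counts' is a placeholder name; the 0.5 offset is one simple, assumed way of handling zero counts):

props <- prop.table(as.matrix(counts) + 0.5, margin = 1)  # close each sample to sum to 1
clr   <- log(props) - rowMeans(log(props))                # centred log-ratio coordinates
pca   <- prcomp(clr)                                      # e.g. an ordination on the clr scale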


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University Yokohama Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual tool of validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative we have employed a minimum squares type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data except for one species out of the 83 species.
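
A sketch, under assumed variable names, of the two estimation routes described: maximum likelihood via MASS::fitdistr, then a minimum-squares type estimate that minimises squared discrepancies on the Probability-Probability plot ('w' is a placeholder vector of weights for one species).

library(MASS)
ml <- fitdistr(w, "gamma")                       # maximum likelihood estimates (shape, rate)
pp_ss <- function(par, x) {
  u <- pgamma(sort(x), shape = par[1], rate = par[2])
  sum((u - ppoints(length(x)))^2)                # squared distances on the P-P plot
}
ls <- optim(ml$estimate, pp_ss, x = w)           # minimum squares type estimate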


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1

1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Complications of diabetes, such as kidney disease, cause patients considerable pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data and information on this trend in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that used the files of patients with type 2 diabetes who attended the Isfahan Metabolic and Endocrine Research Center from 1997 to 2007. This information was collected longitudinally, and we used linear mixed-effects models to analyse the data. Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.
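
A generic R sketch (not the study's actual model) of a linear mixed-effects model for repeated creatinine measurements, with a random intercept and slope per patient; the data frame 'iemrc' and variable names are placeholders.

library(nlme)
fit <- lme(creatinine ~ time + sex + age + duration + BUN + FBS + sbp,
           random = ~ time | patient, data = iemrc)   # random intercept and slope per patient
summary(fit)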

Conclusion: The information this study provides can be used to identify groups at high risk of renal dysfunction.

Key Words: Longitudinal study, Mixed-effects models, Creatinine, Type 2 diabetes


THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS UNDERGOING ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1 Baki Billah2 and Christopher Reid2

1PhD student in biostatistics, Department of ...
2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models. Background: Models of postoperative complications for isolated CABG surgery developed in one population may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgery for an Australian population because no model has been developed in the Australian context. Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14,533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation set (60%) and a model validation set (40%). The creation set was used to develop the model and the validation set to validate it. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC curve and the Hosmer-Lemeshow p-value respectively. Results: Among the 14,533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3). The two postoperative complications were new renal failure (365) and stroke (138). The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L < 0.001). Conclusion: We have identified risk factors for two major postoperative complications of CABG surgery.
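
An illustrative outline in R of the model-building steps described, with placeholder data frames 'creation' and 'validation' and assumed covariate names; the bootstrap step is omitted here.

library(pROC)
fit  <- glm(renal_failure ~ age + gender + iabp + pvd + dialysis + ef + cpb_time,
            family = binomial, data = creation)       # multiple logistic regression
fit2 <- step(fit, direction = "backward")             # stepwise backward elimination
pred <- predict(fit2, newdata = validation, type = "response")
auc(roc(validation$renal_failure, pred))              # discrimination on the validation set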


SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2

1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics Informatics and Statistics
2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis in which physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data; instead, they are usually point predictions from spatial models fitted to auxiliary data sources. It is not clear what kind of effect the modelled covariates will have on the model, although simple approximations for simple models do give indications. We have performed simulation studies to investigate the nature of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.


COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR SEPTIC STATE OF CHILDREN PATIENTS

Michal Smerek1 Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno
2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this article is to identify the dependency structure of gene variants which influence septic states in paediatric patients.

The data set contains data on 580 paediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1,2,3]. To identify the role of different combinations of gene variants, and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. In this way it was possible to create a five-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patient group. The typical combinations of gene variants for the healthy group and for the septic patient group were then found. The results correspond closely to those published in [1, 2, 3] for individual genes, and make it possible to recognise the typical combinations of variants of the six genes on which attention should be concentrated.
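
A small sketch of the kind of hierarchical log-linear model fit described; 'tab' is a placeholder five-way table of gene-variant combinations for one study group, and the two-way margins shown are arbitrary illustrations, not the authors' selected association structure.

fit <- loglin(tab, margin = list(c(1, 2), c(2, 3), c(3, 4), c(4, 5)), fit = TRUE)
1 - pchisq(fit$lrt, fit$df)     # likelihood-ratio goodness-of-fit test of that structure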

References: [1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal/permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol 33, pp 2158-2164. [2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. Cesko-Slovenska Pediatrie 59, pp 632-636, 2004. [3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukin-6 gene variants and the risk of sepsis development in children. Human Immunology, Elsevier Science Inc, ISSN 0198-8859, 2007, vol 68, pp 756-760.


IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1 PJ Cameron2 PJ Wigley3 S Elliott3 S Madhusudan JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd
2 20 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique involving field application, with a tractor-mounted boom sprayer, of Bacillus thuringiensis Berliner (Bt) was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established the persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca 200 m away from the fields (or release point for earlier work) was increased by 15-18 fold to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion of the moths remained was also estimated, viz for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
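
A sketch of the two computational steps, with placeholder names and assuming the displayed equation is exp(-bc)(1 + bc) - (1 - p) = 0 as reconstructed above: fit a log-link Poisson dispersal curve, then solve numerically for the distance containing a proportion p of the moths.

fit <- glm(catch ~ distance, family = poisson(link = "log"), data = traps)  # dispersal curve
b   <- -coef(fit)["distance"]               # decay rate from the fitted curve
p   <- 0.90
c90 <- uniroot(function(d) exp(-b * d) * (1 + b * d) - (1 - p),
               interval = c(1, 5000))$root  # distance within which 90% of moths remain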


IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1 Dominic Lee1 Glynn Russell2 Brian Darlow3 Brian Scarrott and Marco Reale1

1University of Canterbury
2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al. 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study, all oxygen saturation measurements across different behavioural states were combined together. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.
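
A rough R sketch of the clustering step, with assumed names ('spo2' with columns cv and baby): each baby's measure is summarised as an empirical distribution function on a common grid, and the babies are then clustered hierarchically into two groups.

grid <- seq(80, 100, by = 0.5)                         # common grid of saturation values
edfs <- t(sapply(split(spo2$cv, spo2$baby),
                 function(x) ecdf(x)(grid)))           # one empirical CDF per baby
hc   <- hclust(dist(edfs))                             # hierarchical clustering of the EDFs
groups <- cutree(hc, k = 2)                            # stable versus unstable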

Zahari M, Lee DS, Russell G, et al. (2008). Australian Statistical Conference 2008.


Index of Presenting Authors

Arnold R 33, Asghari M 26, Baird D 30, Balinas V 46, Barnes M 27, Basford KE 84, Beath K 51, Bland JM 49, Briggs J 76, Burridge C 28, Burzykowski T 73, Butler R (poster) 89, Campain A 56, Chang K 70, Chang Y 85, Chee C 77, Clark ASS 36, Clarke S 74, Clifford D 72, Connolly P (poster) 91, Cui J 55, D'Antuono M 67, Darnell R (1) 35, Darnell R (2) 47, Davy M 40, Day S 43, Ding P 69, Dobbie M 48, Dodds K 88, Fewster R 37, Forrester R 34, Ganesalingam S 79, Ganesh S 78, Gatpatan JMC 48, Graham M 33, Graham P 65, Huang E 87, Hwang J 57, Ihaka R 36, Jones G 45, Kifley A 53, Kipnis V 61, Koolaard J (poster) 92, Kravchuk O (poster 1) 90, Kravchuk O (poster 2) 92, Lazaridis D 93

Le Cao K 81, Littlejohn R 67, Liu I 66, Liu J 82, Lumley T 59, Marschner I 52, Matthews D 58, McLachlan A 44, Meyer D 68, Meyer R 75, Mohebbi M 38, Mueller S 47, Muller W (poster) 94, Naka M (poster) 95, Neeman T 82, Neuhäuser M 86, Orellana L 54, Park E 64, Park Z 42, Pirie M 32, Poppe K 71, Rousta S (poster) 96, Ruggiero K 71, Ryan L 25, Sanagou M (poster) 97, Saville D 83, Scott A 60, Shimadzu H 62, Shimadzu H (poster) 98, Sibanda N 76, Smerek M (poster) 99, Smith AB 69, Stewart M 77, Stojanovski E 31, Taylor J 41, Thijs H 50, Triggs CM 80, Wallace AR (poster) 100, Wang Y 29, Welsh A 51, Williams E 70, Yee T 62, Yelland L 63, Yoon H 39, Zahari M (poster) 101


DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

DAntuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau


Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde


Niven Brian bnivenmathsotagoacnz

OBrien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

OSullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz


VENUE INFORMATION & MAP
Venue: The Biometrics 2009 meeting will be held at the Suncourt Hotel and Conference Centre on the lakefront in Taupo, New Zealand. It has uninterrupted views of the stunning great Lake Taupo, with a backdrop of the majestic volcanoes Mt Ruapehu, Mt Tongariro and Mt Ngauruhoe.

Suncourt Hotel: The Suncourt Hotel and Conference Centre is rated Qualmark 4 Star and is perfectly situated in the centre of Taupo. The lake, town centre, boat harbour, cafés and night-life are all only a stroll away.

Driving directions to Huka Prawn Farm:
Head west on Northcroft Street toward Titiraupenga Street (0.2 km)
Turn left at Titiraupenga Street (31 m)
Turn right at Lake Tce (0.5 km) (or alternatively go up to Heuheu Street, then onto Tongariro Street)
Continue onto Tongariro Street (1.1 km - go through one roundabout)
Continue onto SH 1 / SH 5 (1.0 km)
Turn right at Huka Falls Road and continue past Huka Falls and Loop Road (note that Huka Falls Road becomes Karetoto Road)
Take the sign-posted right just past Helistar and continue straight past Honey Hive to the end of Karetoto Road

A: Suncourt Hotel & Conference Centre, 14 Northcroft Street, Taupo, (07) 378 8265, www.suncourt.co.nz

B: Huka Prawn Farm, Huka Falls Road, Wairakei Park, Taupo


ORGANISED SOCIAL ACTIVITIES
Conferences can be intense and lead to "brain strain" for some, so relief from the scientific program is often welcome and necessary for recharging one's batteries. With this in mind the LOC have organised some great social events that will hopefully afford you the opportunity to meet new faces and catch up with your biometrical colleagues. The social activities planned for Tuesday afternoon provide an opportunity for delegates to relax and get out and about in the beautiful Lake Taupo region.

Please note that the costs of the welcome reception and the conference dinner are included in the registration fee for a conference delegate (IBS member, non-member or student) attending the whole week. Additional tickets may be purchased for day registrants and partners who are interested in attending either of these events.

Young Statisticians' Night - Monday 30 Nov
This social event is for young statisticians to get together in an informal, relaxing atmosphere so you can share your research and meet your possible future colleagues. As long as you consider yourself a "young statistician/biometrician" you are welcome to attend this event. We will meet at 6pm at the Chill restaurant deck and either stroll into town together or organise a maxicab (sharing the cost) to head for a hot swim then a meal at the Terraces Hotel (80-100 Napier Taupo Highway, Taupo, Tel (07) 378-7080).

Please let Sammie Jia (yilinjiaplantandfoodconz) know if you are interested in attending this event

Other Organised Social Activities- Tuesday 1 Dec

1 Cruise on Lake Taupo with optional extras

With Chris Jolly Outdoors smell the coffee brewing as you board the Waikare II take a seat and enjoy all the comforts as you cruise to the Maori Rock Carvings A full commentary is given on the history of Lake Taupo as you cruise out of the Taupo marina Take in the fabulous views of secluded bays and the spectacular mountain peaks of Tongariro National Park The sights are amazing all year round Afternoon tea is included as part of your charter and tea or coffee are complimentary throughout the cruise There are also full bar facilities

Fishing for and hopefully eating rainbow or brown trout is included in the charter although to meet licence requirements only four clients can be nominated to actually land the catch Only 4 lines can be put out at a time on downriggers If successful any catch can be barbequed or sashimied and served and shared onboard - there


is nothing like freshly caught trout. There is also the alternative of having the fish smoked and vacuum packed - there are extra charges for this. The trout could also be taken back to your accommodation, where the chef could perhaps prepare them as part of one of your dinners. (In New Zealand trout, as a game fish, cannot be sold, so you won't see it on restaurant menus.) Or you may wish to cook it yourself in the Suncourt Hotel's barbeque area or elsewhere. You may also wish to do clay bird shooting; the cost is $1.80 per shot.

Time: 2.30 pm charter departure, allowing time to walk to the Boat Harbour after lunch, returning about 5.30 pm to berth
Where: Boat harbour/marina at the mouth of the Waikato River at the north end of the lake front
Take: Swimwear, including towel, if you want an invigorating deep water swim off the launch. Don't forget to take your camera as some of the scenery can only be seen from on the water
Cost: $70 per person based on a three hour scenic charter including fishing, with clay bird shooting extra at $1.80 per shot
Notes: For this activity to proceed at this cost we require a minimum of 10 persons

2 Kayaking down the Waikato River

This trip starts from the source of the mighty Waikato River, and the kayaking finishes with a small rapid at Reid's Farm Kayak Course. Then a short shuttle ride takes you to the renowned Huka Falls, from where you are able to walk back up river to Spa Park.

It is a gentle paddle down through a scenic waterway, with opportunities to see people bungy jump from underneath the platform and to soak in some natural thermal streams en route. It is a great trip for the "first time" kayaker of any age, or for those wanting the most peaceful scenic cruise available in Taupo. Exiting the river, there is time for a snack on the riverbank before you are transported to the Huka Falls, where you may wish to find the track accessing their base. You will have plenty of time for sightseeing and the walk back to Spa Park.


Leaving the gushing sounds of the mesmerizing Falls, you cut through leafy regrowth onto a beautiful scenic river track. The track passes through exotic and regenerating forest and climbs away from the river to afford good views of cliffs, mid-channel islands and the Huka Falls Lodge on the far bank. There are some scenic lookouts along the way where you can take in the full glory of the majestic Waikato River. As you near Spa Park the track winds back down towards the river and you emerge from the bush to see the beautiful natural hot pools that are great for soothing your weary feet. The track is graded as an easy walk and should take about one hour, but you have longer before pick-up at a pre-arranged time to return to your residence.

So bring your swimwear if you want to enjoy a relaxing soak at the thermal streams on the kayak down or on the walk back

Time: Pickup from Suncourt Hotel at 1.30 pm, return around 6.00 pm
Take: Swimwear, towel, outdoor shoes, sunscreen, hat and camera (waterproof case may be handy)
Cost: $50 per person
Notes: This activity requires a minimum of 4 people to proceed. Keen mountain bikers may wish to contact Hans Hockey for their options

3 Jetboating, geothermal and nature - Orakei Korako or the Squeeze

The New Zealand RiverJet experience offers conference delegates one of two options to enhance this thrilling jet boating experience. In both cases the Riverjet adventure will introduce you to a slice of New Zealand scenery only a select few have ever seen. Tour through great pine forests and open farmlands that line the river bank, along with some of New Zealand's most beautiful unspoilt native bush. The sleek black boat takes you through the magnificent Tutukau Gorge, where canyon walls rise 50 dramatic metres above the magnificent Waikato River, on the way to the hidden valley of Orakei Korako, possibly the best thermal area in New Zealand.


In both cases thermal activity will be seen from the boat, but option 1 ($155) allows one hour's entry to the thermal wonderland for a close-up look at gushing geysers, hot springs, boiling mud pools, the awesome Aladdin's cave and some of the largest silica terraces in the world.

While the park visitors are on land option 2 ($140) whisks swimmers away to the Squeeze You will disembark the boat in knee deep warm water After manoeuvring your way through narrow crevasses climbing boulders and wading through waist-deep warm water you emerge in stunning native New Zealand bush Immerse yourself in the environment and take the opportunity to soak up the atmosphere while relaxing in the thermal waters of a naturally heated bathing pool

Then the groups rejoin for the thrilling return trip giving a total trip time of about three hours This is the only thrilling jet boat ride in New Zealand that incorporates a geothermal wonderland experience

Time: Transport departs Suncourt Hotel at 1.30 pm, returns at approximately 5.30 pm
Take: Appropriate footwear for option 1; swimwear and towel for option 2. Don't forget to take your camera as some of the scenery can only be seen from the river. Tie on your hats. You may have time to eat your own snack during option 1
Cost: $155 pp for option 1 including park admission; $140 pp for option 2; both options including transport
Notes: For this activity to proceed we require a minimum of only 4 people in total, as the same boat is used


4 Art Attractions and Wine Tasting Tour

Guided return bus transport to Taupo's most visited natural and other attractions, which will include the following: a visit to Aratiatia Rapids in time to see the release of water from the upstream dam; viewing the geothermal borefield from where electricity is generated; a short walk amongst the geothermal features at Craters of the Moon; a viewing of the mighty Huka Falls from both sides of the river; a visit to Acacia Bay's L'Arte, which has an interesting mosaic and sculpture garden with a gallery and cafe; concluding with drop-off at Scenic Cellars, where they have an enomatic wine tasting system and you can choose from up to 32 wines, a selection of both NZ and international wines. This is on the lakefront edge of the CBD's restaurant and nightlife area, five minutes' walk from the Suncourt Hotel.

Time: Tour coach with guide departs Suncourt Hotel at 1.30 pm promptly; terminates at the nearby Scenic Cellars at approximately 5.30 pm
Take: Cafe snack is not included but all entry fees are. Don't forget to take your camera
Cost: $70 per person
Notes: For this activity to proceed we require a minimum of 8 people. The maximum limit is 22


SPONSORS
The International Biometric Society Australasian Region gratefully acknowledges the financial support of this conference from the following organisations, in particular SAS as premier sponsor.

SAS Business Intelligence Software and Predictive Analytics

Roche Australia

CSIRO Mathematics Informatics and Statistics

Statistical Data Analysis | VSN International | Software for Bioscientists

Department of Statistics The University of Auckland

School of Biological Sciences The University of Auckland


AgResearch - Bioinformatics Mathematics and Statistics

Department of Mathematics and Statistics The University of Melbourne

Biometrics Matters Limited

Hoare Research Software

OfficeMax


VENUE FLOOR PLAN

1. Boardroom: For all Boardroom session presentations
2. Swifts: For keynote addresses, invited speaker talks and all Swifts sessions
3. Bathrooms/Toilets
4. 'Chill on Northcroft' Restaurant: All morning/afternoon teas and lunches will be provided here
5. Gullivers: Computer room with two internet access desktops
6. Lems: Registration desk location, and further desk space and power points for wireless internet access



CONFERENCE TIMETABLE

SUNDAY 29TH NOV
1600 Conference Registration opens
1800 Welcome Reception

Dinner (own arrangement)

MONDAY 30TH NOV
850 Presidential Opening (Swifts)
Graham Hepworth, University of Melbourne
900 Keynote Address (Swifts)
Louise Ryan, CSIRO Mathematics Informatics and Statistics
Quantifying uncertainty in risk assessment
Chair Graham Hepworth

950

-1030

Session 1 Swifts Medical

Chair John Field

Session 1 Boardroom Ecological ModellingChair Teresa Neeman

950 Sub site evaluation of prognostic factors of colon and rectal cancers using competing risks survival analysis approachMohamad Asghari Tarbiat Modares University

Spatial modelling of prawn abundance from large-scale marine surveys using penalised regression splinesCharis Burridge CSIRO Mathematics Informatics and Statistics

1010 Personalised medicine endovascular aneurysm repair risk assessment model using preoperative variablesMary Barnes CSIRO Mathematics Informatics and Statistics

Rank regression for analyzing environmental dataYou-Gan Wang CSIRO Mathematics Informatics and Statistics

1030 Morning Tea (30 minutes)1100

-1220

Session 2 Swifts Modelling

Chair Andrew McLachlan

Session 2 Boardroom Environmental amp Methods

Chair Zaneta Park1100 Introduction to Quantile

regressionDavid Baird VSN NZ Ltd

Capture recapture estimation using finite mixtures of arbitrary dimension Richard Arnold Victoria University


MONDAY 30TH NOV1120 Incorporating study

characteristics in the modelling of associations across studiesElizabeth Stojanovski University of Newcastle

The effect of a GnRH vaccine GonaCon on the growth of juvenile tammar wallabiesRobert Forrester ANU

1140 A comparison of matrices of time series with application in dendroclimatologyMaryanne Pirie University of Auckland

Model based grouping of species across environmental gradientsRoss Darnell CSIRO Mathematics Informatics and Statistics

1200 How SAS and R integrateMichael Graham SAS Auckland

The use of the chi-square test when observations are dependentAustina Clark University of Otago

1220 Lunch (1 hour 10 minutes)

1330 Invited Speaker (Swifts)
Ross Ihaka, University of Auckland
Writing Efficient Programs in R and Beyond
Chair Renate Meyer

1410

-1510

Session 3 Swifts Variance

Chair Geoff Jones

Session 3 Boardroom Genetics

Chair John Koolaard

1410 Variance estimation for systematic designs in spatial surveysRachel Fewster University of Auckland

Developing modules in genepattern for gene expression analysisMarcus Davy Plant and Food Research

1430 Variance components analysis for balanced and unbalanced data in reliability of gait measurementMohammadreza Mohebbi Monash University

High dimensional QTL analysis within complex linear mixed modelsJulian Taylor CSIRO Mathematics Informatics and Statistics

1450 Modernizing AMOVA using ANOVAHwan-Jin Yoon ANU

Correlation of transcriptomic and phenotypic data in dairy cowsZaneta Park AgResearch

1510 Afternoon Tea (30 minutes)


MONDAY 30TH NOV1540

-1700

Session 4 Swifts Modelling

Chair Mario D'Antuono

Session 4 Boardroom Ecology

Chair Rachel Fewster1540 Non-inferiority margins in

clinical trialsSimon Day Roche Products Ltd

Visualising model selection criteria for presence and absence data in ecology Samuel Mueller University of Sydney

1600 Data processing using Excel with RAndrew McLachlan Plant and Food Research Lincoln

Estimating weights for constructing composite environmental indicesRoss Darnell CSIRO Mathematics Informatics and Statistics

1620 Investigating covariate effects on BDD infection with longitudinal data Geoffrey Jones Massey University

A spatial design for monitoring the health of a large-scale freshwater river systemMelissa Dobbie CSIRO Mathematics Informatics and Statistics

1640 Statistical modelling of intrauterine growth for FilipinosVincente Balinas University of the Philippines Visayas

Backfitting estimation of a response surface modelJhoanne Marsh C Gatpatan University of the Philippines Visayas

1700 Poster SessionChair Melissa Dobbie

1800 Dinner (own arrangement)


TUESDAY 1ST DEC
900 Keynote Address (Swifts)
Martin Bland, University of York
Clustering by treatment provider in randomised trials
Chair Simon Day

950

-1030

Session 1 Swifts Missing Data

Chair Vanessa Cave

Session 1 Boardroom Count Data

Chair Hwan-Jin Yoon950 The future of missing data

Herbet Thijs Hasselt University

A strategy for modelling count data which may have extra zerosAlan Welsh ANU

1010 Application of latent class with random effects models to longitudinal dataKen Beath Macquarie University

A reliable constrained method for identity link Poisson regressionIan Marschner Macquarie University

1030 Morning TeaIBS Biennial General Meeting (60 minutes)

1130

-1230

Session 2 Swifts Medical

Chair Hans Hockey

Session 2 Boardroom Modelling

Chair Olena Kravchuk1130 Multivariate response

models for global health-related quality of lifeAnnette Kifley Macquarie University

Building a more stable predictive logistic regression modelAnna Campain University of Sydney

1150 Estimation of optimal dynamic treatment regimes from longitudinal observational dataLiliana Orellana Universidad de Buenos Aires

Stepwise paring down variation for identifying influential multifactor interactionsJing-Shiang Hwang Academia Sinica

1210 Parametric conditional frailty models for recurrent cardiovascular events in the lipid studyJisheng Cui Deakin University

Empirical likelihood estimation of a diagnostic test likelihood ratioDavid Matthews University of Waterloo

1230 Lunch (1 hour)1330 Organised Social Activities

1800 Dinner (own arrangement)


WEDNESDAY 2ND DEC
900 Keynote Address (Swifts)
Thomas Lumley, University of Washington
Using the whole cohort in analysis of subsampled data
Chair Alan Welsh

950

-1030

Session 1 Swifts Clinical Trials

Chair Ian Marschner

Session 1 Boardroom Fisheries

Chair Charis Burridge950 Adjusting for nonresponse in

case-control studiesAlastair Scott University of Auckland

An exploratory analysis of the effects of sampling in marine surveys for biodiversity estimationHideyasu Shimadzu GeoScience Australia

1010 Correcting for measurement error in reporting of episodically consumed foods when estimating diet-disease associationsVictor Kipnis USA National Cancer Institute

On the 2008 World Fly Fishing ChampionshipsThomas Yee University of Auckland

1030 Morning Tea (30 minutes)

1100

-1220

Session 2 Swifts Medical Models

Chair Katrina Poppe

Session 2 Boardroom AgricultureHorticulture

Chair Emlyn Williams

1100 Relative risk estimation in randomised controlled trials a comparison of methods for independent observationsLisa Yelland University of Adelaide

Habitat usage of extensively farmed red deer hinds in response to environmental factors over calving and lactationRoger Littlejohn AgResearch

1120 Multiple stage procedures in covariate-adjusted response-adaptive designsEunsik Park Chonnam National University

Some statistical approaches in estimating lambing rates
Mario D'Antuono, Dept of Agriculture WA


WEDNESDAY 2ND DEC1140 Potential outcomes and

propensity score methods for hospital performance comparisonsPatrick Graham University of Otago

FTIR analysis associations with induction and release of kiwifruit buds from dormancyDenny Meyer Swinburne University of Technology

1200 Local odds ratio estimation for multiple response contingency tablesIvy Liu Victoria University

Non-linear mixed-effects modelling for a soil temperature studyPauline Ding ANU

1220 Lunch (1 hour 10 minutes)
1330 Invited Speaker (Swifts)
Alison Smith, NSW Department of Industry and Investment
Embedded partially replicated designs for grain quality testing
Chair David Baird

1410

-1510

Session 3 Swifts Design

Chair Ross Darnell

Session 3 Boardroom Functional AnalysisChair Marcus Davy

1410 Spatial models for plant breeding trialsEmlyn Williams ANU

Can functional data analysis be used to develop a new measure of global cardiac functionKatrina Poppe University of Auckland

1430 A two-phase design for a high-throughput proteomics experimentKevin Chang University of Auckland

Variable penalty dynamic warping for aligning GC-MS dataDavid Clifford CSIRO

1450 Shrinking sea-urchins in a high CO2 world a two-phase experimental designKathy Ruggiero University of Auckland

A model for the enzymatically 18O-labeled MALDI-TOF mass spectraTomasz Burzykowski Hasslet University

1510 Afternoon Tea (30 minutes)


WEDNESDAY 2ND DEC1540

-1700

Session 4 Swifts Methods

Chair David Clifford

Session 4 Boardroom Mixtures amp Classification

Chair Thomas Yee1540 High-dimensional multiple

hypothesis testing with dependenceSandy Clarke University of Melbourne

On estimation of nonsingular normal mixture densitiesMichael Stewart University of Sydney

1600 Metropolis-Hastings algorithms with adaptive proposalsRenate Meyer University of Auckland

Estimation of finite mixtures with nonparametric componentsChew-Seng Chee University of Auckland

1620 Bayesian inference for multinomial probabilities with non-unique cell classification and sparse dataNokuthaba Sibanda Victoria University

Classification techniques for class imbalance dataSiva Ganesh Massey University

1640 Filtering in high dimension dynamic systems using copulasJonathon Briggs University of Auckland

Comparison of the performance of QDF with that of the discriminant function (AEDC) based on absolute deviation from the meanSelvanayagam Ganesalingam Massey University

1800 Conference Dinner


THURSDAY 3RD DEC
900 Keynote Address (Swifts)
Chris Triggs, University of Auckland
Nutrigenomics - a source of new statistical challenges
Chair Ruth Butler

950

-1030

Session 1 Swifts Genetics

Chair Ken Dodds

Session 1 Boardroom Ecology

Chair Duncan Hedderley950 Combination of clinical and

genetic markers to improve cancer prognosisKim-Anh Le Cao University of Queensland

A multivariate feast among bandicoots at Heirisson ProngTeresa Neeman ANU

1010 Effective population size estimation using linkage disequilibrium and diffusion approximationJing Liu University of Auckland

Environmental impact assessments a statistical encounterDave Saville Saville Statistical Consulting Ltd

1030 Morning Tea (30 minutes)
1100 Invited Speaker (Swifts)
Kaye Basford, University of Queensland
Ordination of marker-trait association profiles from long-term international wheat trials
Chair Lyn Hunt

1140

-1220

Session 2 Swifts Medical

Chair Ken Beath

Session 2 Boardroom Genetics

Chair Julian Taylor1140 Finding best linear

combination of markers for a medical diagnostic with restricted false positive rateYuan-chin Chang Academia Sinica

Believing in magic validation of a novel experimental breeding designEmma Huang CSIRO Mathematics Informatics and Statistics

A modified combination test for the analysis of clinical trials
Markus Neuhäuser, Rhein Ahr Campus

Phenotypes for training and validation of whole genome selection methodsKen Dodds AgResearch

1220 Closing Remarks1230 Lunch1300 Conference Concludes


ORAL PRESENTATION ABSTRACTS

MONDAY 30TH NOV

900 Keynote Address (Swifts)
Louise Ryan, CSIRO Mathematics Informatics and Statistics
Chair Graham Hepworth

QUANTIFYING UNCERTAINTY IN RISK ASSESSMENT

Louise RyanCSIRO Mathematics Informatics and Statistics

E-mail LouiseRyancsiroau

Risk assessment is used in a variety of different fields to quantify the chance of various adverse or catastrophic events. Such quantifications can then be used to inform policy (e.g. environmental standards) or to attach an appropriate monetary value (e.g. in the insurance industry). However, risk assessment is fraught with uncertainty due to model choice, data availability or the underlying science of the problem being studied. This presentation will focus on some of the strategies that can be applied to quantify the impact of uncertainty, and will illustrate them with several examples from the environmental arena.


950 - 1030

MONDAY 30TH NOV
Session 1 Swifts: Medical
Chair John Field

SUB SITE EVALUATION OF PROGNOSTIC FACTORS OF COLON AND RECTAL CANCERS USING COMPETING RISKS SURVIVAL ANALYSIS APPROACH

Mohamad Asghari1 Ebrahim Hajizadeh1 Anoshirvan Kazemnejad1 and Seyed Reza Fatemi2

1Department of Biostatistics Faculty of Medical Sciences Tarbiat Modares University Tehran Iran

2Shahid Beheshti University of Medical Sciences Gastrointestinal Research Center Tehran Iran

E-mail masghari862gmailcom

Colorectal Cancer (CRC) is one of the most malignant cancers in the world, and it varies because of the differing effects of risk factors in different parts of the world. Knowing the risk factors of the cancer also has clinical importance for prognosis and treatment. However, evaluation of the risk factors of the cancer as a whole would not provide a thorough understanding of the cancer. Therefore, the aim of this study was to determine specific risk factors of colon and rectum cancers via a competing risks survival analysis. A total of 1219 patients with a CRC diagnosis according to the pathology reports of the cancer registry of RCGLD from 1 January 2002 to 1 October 2007 were entered into the study. Data were analysed using univariate and multivariate competing risks survival analysis in the Stata statistical software. The results confirm gender, alcohol history, IBD and tumour grade as specific risk factors of colon cancer, and hypertension, opium and personal history as specific risk factors of rectum cancer. BMI and pathologic stage of the cancer are common risk factors of both types of cancer. Based on our findings, CRC is not a single entity, and colon and rectum cancers should be evaluated separately to reveal hidden associations which may not be revealed under general modelling. These findings could provide more information for prognosis and treatment, and for the possible application of screening programs specifically for colon and rectum carcinomas.
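
The analysis was done in Stata; purely as an illustration, a comparable cause-specific competing-risks regression in R could be sketched with the cmprsk package (placeholder data frame 'crc' and covariate names):

library(cmprsk)
X   <- model.matrix(~ sex + alcohol + IBD + grade, data = crc)[, -1]
fit <- crr(ftime = crc$time, fstatus = crc$cause,   # cause: 1 = colon, 2 = rectum, 0 = censored
           cov1 = X, failcode = 1, cencode = 0)     # model for the colon-cancer endpoint
summary(fit)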


PERSONALISED MEDICINE: ENDOVASCULAR ANEURYSM REPAIR RISK ASSESSMENT MODEL USING PREOPERATIVE VARIABLES

1CSIRO Australia Mathematics Informatics and Statistics Glen Osmond

South Australia2Department of Surgery University of Adelaide the Queen Elizabeth Hospital

Adelaide South Australia

E-mail MaryBarnescsiroau

An Endovascular Aneurysm Repair Risk Assessment (ERA) model was developed as an aid for surgeons and patients to decide treatment options. The ERA model predicts the likely risk associated with endovascular aneurysm repair based on a patient's pre-operative variables. The ERA model was based on data from an Australian audit of nearly 1000 patients who underwent the procedure over an 18-month period between 1999 and 2001 and whose outcomes were followed for more than five years.

The ERA Model is available at the following website: www.health.adelaide.edu.au/surgery/evar. The ERA model enables clinicians to enter up to eight preoperative variables for a patient in order to derive the predicted likelihood of primary endpoints such as early death, aneurysm-related death, survival, type I endoleaks and mid-term re-interventions. Secondary endpoints predicted include technical and clinical success, type II endoleaks, graft complications, migration, rupture and conversion to open repair. The eight pre-operative variables are age at operation, American Society of Anaesthesiologists rating, gender, aneurysm diameter, creatinine, aortic neck angle, infrarenal neck length and infrarenal neck diameter. Stepwise forward regressions (binomial model with logit link) were used to select which of the preoperative patient variables were included in each success measure risk model.

The ERA Model was internally validated on Australian data using bootstrapping [1]. Recently it has been externally validated using data from a specialist UK vascular institute. Despite UK patients being sicker (p < 0.001), having larger aneurysms (p < 0.001) and being more likely to die (p < 0.05) than the Australian patients, the ERA model fitted the UK data better for the risk factors early death, aneurysm-related death, three-year survival and type I endoleaks, as evidenced by higher areas under ROC curves and/or higher R2.

The ERA Model appears to be robust. Further external validation and improvements to the model will occur within a recently approved NHMRC grant.

1. Barnes (2008). Eur J Vasc Endovasc Surg 35: 571-579.


950 - 1030

MONDAY 30TH NOV
Session 1 Boardroom: Ecological Modelling
Chair Teresa Neeman

SPATIAL MODELLING OF PRAWN ABUNDANCE FROM LARGE-SCALE MARINE SURVEYS USING PENALISED REGRESSION SPLINES

Charis Burridge1 Geoff Laslett1 and Rob Kenyon2

1CSIRO Mathematics Informatics and Statistics
2CSIRO Marine and Atmospheric Research

E-mail charisburridgecsiroau

Since the 1970s CSIRO has been closely involved in assessing the status of prawn stocks in the Northern Prawn Fishery (NPF), fitting population dynamics models to commercial catch data and conducting prawn ecology research. Between 1975 and 1992 there were three survey series, each covering a limited region. However, there was no ongoing monitoring of multiple regions, due to the high cost of conducting such surveys and the apparent wealth of data from commercial logbooks. In 2001 an international review of the management process for this fishery recommended that an annual multi-species fishery-independent survey be introduced. Bi-annual surveys of prawn distribution and abundance started in August 2002: 300 trawls for the recruitment survey in late summer and 200 trawls for adults in winter. Locations were randomly selected from areas of long-standing fishing effort. We fitted penalised regression splines to the density of several commercial prawn species using an MCMC approach implemented in BayesX (http://www.stat.uni-muenchen.de/~bayesx). Some adaptation was needed in order to allocate knots to the subset of the 300,000 sq km Gulf of Carpentaria represented by the survey. The Bayesian approach leads straightforwardly to mean density estimates with a credible interval for each region as well as for the entire survey area. We compare this approach with more routine design-based estimates and bootstrapped confidence intervals.
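
The authors fitted Bayesian penalised regression splines in BayesX; a non-Bayesian analogue of a penalised-spline spatial surface for trawl densities could be sketched in R with mgcv (placeholder names; the offset assumes densities are standardised by swept area), shown only to illustrate the kind of model, not their implementation.

library(mgcv)
fit <- gam(count ~ s(lon, lat, k = 60) + offset(log(swept_area)),
           family = poisson, data = trawls)   # penalised thin-plate spline over space
summary(fit)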


RANK REGRESSION FOR ANALYZING ENVIRONMENTAL DATA

You-Gan Wang1 and Liya Fu2

1CSIRO Mathematics Informatics and Statistics, Australia
2School of Mathematics and Statistics, Northeast Normal University, China

E-mail you-ganwangcsiroau

We investigate rank regression for environmental data analysis. Rank regression is robust, and has been found to be more natural when substantial proportions of the observations are below detection limits (censored) and more efficient when errors have heavy-tailed distributions. To alleviate the computational burden we apply the induced smoothing method, which provides both regression parameter estimates and their covariance matrices after a few iterations. Datasets from a few environmental studies will be analyzed for illustration.


1100 -1220

MONDAY 30TH NOV
Session 2 Swifts: Modelling
Chair Andrew McLachlan

AN INTRODUCTION TO QUANTILE REGRESSION

David Baird1
1VSN NZ Limited

E-mail davidvsnconz

Quantile regression is a technique that fits models to the quantiles of the data, for example the median, rather than to the mean. Rather than minimizing the usual sum of squares, for the median we minimize the sum of absolute deviances. In general, for quantile Q we minimize Σ e(Q - I(e < 0)), where I is a 0/1 indicator function and e is the model residual. Quantile regression does not make any assumptions about the distribution of the data. Rather than just a single curve being fitted to the data, a set of curves representing different quantiles can be fitted. Fitting the model is done using linear programming, and inference is made using approximate methods or bootstrapping. Quantile regression can be extended from linear models to splines, loess and non-linear models. Quantile regression is available in GenStat, R and SAS. This talk will give a simple, ground-up explanation of the methods and some applications of quantile regression.

Roger Koenker (2005). Quantile Regression. Cambridge University Press.
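
A small illustration of these ideas with the quantreg package in R, using an assumed example data frame 'dat': the median and the 10th and 90th percentiles of y are modelled on x, with bootstrap-based inference.

library(quantreg)
fit <- rq(y ~ x, tau = c(0.1, 0.5, 0.9), data = dat)  # one fit per requested quantile
summary(fit, se = "boot")                             # inference via bootstrapping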


INCORPORATING STUDY CHARACTERISTICS IN THE MODELLING OF ASSOCIATIONS ACROSS STUDIES

Elizabeth Stojanovski1 Junaidi Sutan1 Darfiana Nur1 and Kerrie Mengersen2

1School of MAPS University of Newcastle2Queensland University of Technology

E-mail ElizabethStojanovskinewcastleeduau

Associations between the presence of p16INK4a alteration and a poorer prognosis for Ewingrsquos sarcomas using 2-year survival as the outcome will be investigated Information regarding these associations is however limited and often conflicting This may be partly attributed to differences between studies which can be considered sources of statistical heterogeneity

The purpose of a recent meta-analysis conducted by Honoki et al [2007] was to identify studies that examined the association between p16INK4a status and two-year survival. This study was based on six studies, representing 188 patients, which met the inclusion criteria and were pooled in the meta-analysis. The presence of p16INK4a alteration was found to be a statistically significant predictor of Ewing's sarcoma prognosis (estimated pooled risk ratio 2.17, 95% confidence interval 1.55-3.03).

In the present study a random-effects Bayesian meta-analysis model is conducted to combine the reported estimates of the selected studies by allowing major sources of variation to be taken into account study level characteristics between and within study variance Initially the observed risk ratios are assumed random samples from study-specific true ratios which are themselves assumed distributed around an overall ratio In the second model there are hierarchical levels between the study-specific parameters and the overall distribution The latter model can thus accommodate partial exchangeability between studies acknowledging that some studies are more similar due to common designs locations and so on Analysis was undertaken in WinBUGS

The results of this meta-analysis support an overall statistically significant association between the presence of p16INK4a alteration and Ewing's sarcoma. By allowing for differences in study design, this strengthens the findings of the former study.
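A random-effects Bayesian meta-analysis of this general type can be sketched as below; the abstract's analysis used WinBUGS and study-level covariates, so this rjags sketch with made-up log risk ratios is only an assumed, simplified illustration of the first model described.

  library(rjags)
  ## Hypothetical per-study log risk ratios and their sampling variances.
  yi <- c(0.9, 0.4, 1.1, 0.7, 0.6, 0.8)
  vi <- c(0.09, 0.12, 0.30, 0.15, 0.20, 0.25)
  model_string <- "model {
    for (i in 1:N) {
      prec.y[i] <- 1 / v[i]
      y[i] ~ dnorm(theta[i], prec.y[i])   # observed log RR, known variance
      theta[i] ~ dnorm(mu, prec)          # study-specific true log RR
    }
    mu ~ dnorm(0, 1.0E-4)                 # overall log RR
    prec <- 1 / (tau * tau)
    tau ~ dunif(0, 10)                    # between-study SD
  }"
  jm <- jags.model(textConnection(model_string),
                   data = list(y = yi, v = vi, N = length(yi)))
  post <- coda.samples(jm, c("mu", "tau"), n.iter = 10000)
  summary(post)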


A COMPARISON OF MATRICES OF TIME SERIES WITH APPLICATION IN DENDROCLIMATOLOGY

Maryann Pirie1
1Department of Statistics and School of Geography, Geology and Environment, University of Auckland

E-mail mpir007aucklanduniacnz

All trees produce annual growth rings The widths of these rings depend on whether the climatic conditions during the growing season were favourable or not Kauri (Agathis australis) are widespread over the northern part of the North Island of New Zealand resulting in an abundance of material spanning thousands of years Therefore kauri has a strong potential as a source for inferring past climates

Kauri tree ring widths have been used to reconstruct the activeness of ENSO events over the past 2000 years However there are concerns that the observed patterns are a result of a possible failure of the uniformitarianism principle This is because the responses of kauri to the common climate signal in a particular year may be influenced by the size of the tree and hence that this change in response could affect the observed trends in reconstructed ENSO activity Therefore the dataset containing time series of ring width indices for each core was divided into two subsets

1 The portion of the series produced when the trees were small and

2 The portion of the series produced when the trees were large

Thus each of these subsets is a ragged array of time series. The question of interest is: "How can these two ragged arrays be compared?"

We discuss a method to examine whether two ragged arrays are similar or different and to allow specific time periods of differencesimilarity to be identified


HOW SAS AND R INTEGRATE

Michael Graham1
1Analytics - SAS ANZ

E-mail MichaelGrahamsascom

Many organizations relying on SAS also use R, which offers some users a way to experiment with new, cutting-edge methods. Others find R's open-source nature appealing. For most, both SAS and R are important tools for discovering fact-based answers.

This presentation will give an overview of how SAS and R can work together and plans for future integration

1100 - 1220

MONDAY 30TH NOV Session 2 (Boardroom): Environmental & Methods. Chair: Zaneta Park

CAPTURE RECAPTURE ESTIMATION USING FINITE MIXTURES OF ARBITRARY DIMENSION

Richard Arnold1 Yu Hayakawa2 and Paul Yip3

1Victoria University of Wellington, NZ
2Waseda University, Japan

3Hong Kong University

E-mail richardarnoldmsorvuwacnz

In capture-recapture experiments samples of individuals from a finite population are observed on a number of separate sampling occasions with the goal of estimating the size of the population Crucial to the calculation of reliable estimates from such data is the appropriate modelling of the heterogeneity both among individuals and between sampling occasions

In this paper we use Reversible Jump MCMC to model both sources of heterogeneity and their interaction using finite mixtures RJMCMC automates the selection of models with differing numbers of components and with or without an interaction term To demonstrate the method we analyse a reliability testing data set


THE EFFECT OF A GNRH VACCINE GONACON ON THE GROWTH OF JUVENILE TAMMAR WALLABIES

Robert Forrester1 Melissa Snape2 and Lyn Hinds2

1Statistical Consulting Unit, ANU
2Invasive Animals CRC

E-mail BobForresteranueduau

Vaccination against gonadotrophin releasing hormone (GnRH) disrupts hormonal regulation of reproduction including sexual development GonaConTM is a GnRH vaccine which has been shown to be effective in a number of eutherian (placental) mammals but as yet has not been tested in marsupials Thirty five juvenile tammar wallabies received one of three treatments sham vaccination (Control) a single vaccination of GonaConTM (Vac1) or a single vaccination of GonaConTM followed by a booster (Vac2) Growth measurements on the animals were taken on 18 occasions at irregular intervals over the next 115 weeks Of particular interest was whether there is any difference between the animals that receive the single or boosted vaccination

The data are analysed using repeated measures methods to assess the long term effects of the vaccination Since the data are unequally spaced in time this restricts the number of possible options available Some approaches are explored and the differences between the results examined


MODEL BASED GROUPING OF SPECIES ACROSS ENVIRONMENTAL GRADIENTS

Ross Darnell1 Piers Dunstan2 and Scott Foster1

1CSIRO Mathematics, Informatics and Statistics
2CSIRO Wealth from Ocean Flagship

E-mail RossDarnellcsiroau

We present a novel approach to the statistical analysis and prediction of multispecies data The approach allows the simultaneous grouping and quantification of multiple speciesrsquo responses to environmental gradients The underlying statistical model is a finite mixture model where mixing is performed over the individual speciesrsquo modelled responses Species with similar responses to the environment are grouped with minimal information loss We term these groups species-archetypes Each species-archetype has an associated GLM model that allows prediction with appropriate measures of uncertainty We have illustrated the concept and method using an artificial data We used the method to model the probability of presence of 200 species from the Great Barrier Reef lagoon on 13 oceanographic and geological gradients from 12 to 24 degrees S The 200 species appear to be well represented by 15 species-archetypes The model is interpreted through maps of the probability of presence for a fine scale set of locations throughout the study area Maps of uncertainty are also produced to provide statistical context The probability of presence of each species-archetypes was strongly influenced by oceanographic gradients principally temperature oxygen and salinity The number of species in each cluster ranged from 4 to 34 The method has potential application to the analysis of multispecies distribution patterns and for multispecies management


THE USE OF THE CHI-SQUARE TEST WHEN OBSERVATIONS ARE DEPENDENT

Austina S S Clark1
1University of Otago

E-mail aclarkmathsotagoacnz

When the Chi-square test is applied to test the association between two multinomial distributions each with cells we usually assume that cell observations are independent If some of the cells are dependent we would like to investigate (1) how to implement the Chi-square test and (2) how to find the test statistics and the associated degrees of freedom The test statistics and degrees of freedom are developed from results by Geisser S amp Greenhouse S W (1958 JEBS 69-82) and Huynh H amp Feldt L S (1976 AMS 885-891) We will use an example of influenza symptoms of two groups of patients to illustrate this method One group of patients suffered from H1N1 influenza 09 and the other from seasonal influenza There were twelve symptoms collected for each patient and these symptoms were not totally independent

1330 MONDAY 30TH NOV Invited Speaker (Swifts): Ross Ihaka, University of Auckland. Chair: Renate Meyer

WRITING EFFICIENT PROGRAMS IN R AND BEYOND

Ross Ihaka1 Duncan Temple Lang2 and Brendan McArdle1

1University of Auckland, NZ
2University of California, Davis, US

E-mail ihakastataucklandacnz

Many of the efficiency issues that arise when using R result from basic design decisions made when the software was created. Some major problems result from an early decision to provide S compatibility. In this talk we'll use examples to illustrate some of the efficiency problems which occur when using R and try to give some guidance on how to overcome them. We'll also examine how the design of future systems could be changed to overcome R's efficiency problems. Such design changes will necessarily make these systems look different from R. We'll try to show the nature of these differences.
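A small, self-contained illustration of the kind of inefficiency at issue (not an example from the talk itself): growing an object inside a loop versus letting a vectorised expression do the work.

  ## Growing a vector element by element forces repeated copying ...
  slow <- function(n) { x <- numeric(0); for (i in 1:n) x <- c(x, i^2); x }
  ## ... whereas the vectorised form is evaluated in compiled code.
  fast <- function(n) (1:n)^2
  system.time(slow(5e4))
  system.time(fast(5e4))
  stopifnot(identical(slow(100), fast(100)))   # same answer, very different cost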


1410 - 1510

MONDAY 30TH NOV Session 3 (Swifts): Variance. Chair: Geoff Jones

VARIANCE ESTIMATION FOR SYSTEMATIC DESIGNS IN SPATIAL SURVEYS

Rachel Fewster1
1Department of Statistics, University of Auckland

E-mail rfewsteraucklandacnz

Spatial surveys to estimate the density of objects in a survey region often use systematic designs to reduce the variance of the density estimator However estimating the reduced variance is well-known to be a difficult problem If variance is estimated without taking account of the systematic design the gain in reducing the variance can be lost by overestimating it The usual approaches to estimating a systematic variance are to approximate the systematic design by a random design approximate it by a stratified design or to model the correlation between population units Approximation by a random design can perform very poorly while approximation by a stratified design is an improvement but can still be severely biased in some situations I will describe a new estimator based on modeling the correlation in encounters between samplers that are close in space The new estimator has negligible bias in all situations tested and produces a strong improvement in estimator precision The estimator is applied to surveys of spotted hyena in the Serengeti National Park Tanzania for which it produces a dramatic change in the reported coefficient of variation


VARIANCE COMPONENTS ANALYSIS FOR BALANCED AND UNBALANCED DATA IN RELIABILITY OF GAIT

MEASUREMENT

Mohammadreza Mohebbi1 2 Rory Wolfe1 2 Jennifer McGinley2 Pamela Simpson1 2 Pamela Murphy1 2 and Richard Baker2

1Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
2NHMRC and CCRE in Gait Analysis, Hugh Williamson Gait Laboratory, Royal Children's Hospital and the Murdoch Children's Research Institute, Melbourne, Australia

E-mail MRMmohebbiyahoocom

Background/Aim: Gait measurements are routinely used in clinical gait analysis and provide a key outcome measure for gait research and clinical practice. This study proposes methods for the design, analysis and interpretation of reliability studies of gait measurement.

Method: Balanced and unbalanced experimental designs are described for reliability studies. Within-subject, within-assessor and within-session component errors can be estimated at each point of the gait cycle or averaged across the gait cycle. We present variance component methods for this purpose. Guidelines for calculating sufficient sample size for balanced designs are also provided.

Results: Application of the methods is illustrated with examples from a gait measurement study.

Conclusion: Variance component methods are useful tools in analysing reliability data, but care should be taken in the design and analysis of these studies.
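One common way to obtain such variance components in R is REML via lme4; the sketch below uses simulated data with hypothetical design labels (subject, assessor), not the gait study's actual design or software.

  library(lme4)
  set.seed(1)
  ## Hypothetical balanced reliability design: 10 subjects x 5 assessors x 2 sessions.
  gait <- expand.grid(subject = factor(1:10), assessor = factor(1:5), session = factor(1:2))
  gait$score <- rnorm(10)[gait$subject] + rnorm(5, sd = 0.5)[gait$assessor] +
                rnorm(nrow(gait), sd = 0.3)
  ## Between-subject, between-assessor and residual (within-session) variance components.
  fit <- lmer(score ~ 1 + (1 | subject) + (1 | assessor), data = gait)
  VarCorr(fit)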


MODERNIZING AMOVA USING ANOVA

Hwan-Jin Yoon1 Terry Neeman1 and Emlyn Williams1

1Statistical Consulting Unit Australian National University

E-mail hwan-jinyoonanueduau

The analysis of molecular variance (AMOVA) is a methodology for assessing genetic diversity between regions and among populations within a region using molecular data such as restriction fragment length polymorphisms (RFLPs) or microsatellites. The effectiveness and usefulness of AMOVA in detecting between- and within-region genetic variation have been widely demonstrated. The initial framework for AMOVA was defined by Cockerham (1963, 1973). AMOVA is essentially a hierarchical analysis of variance (ANOVA), as it partitions genetic variation by tiers: individuals within a population, populations within a region, and among regions. To do AMOVA, special genetics packages such as Arlequin and GenAlex are required.

Using fungus microsatellite data, we show that the analysis of variance layouts for AMOVA and ANOVA are the same. Variance components using REML and AMOVA are compared. General statistical packages, in which ANOVA and REML are standard methods, may be preferred to the special genetics packages for AMOVA as they offer the advantages of REML and allow greater flexibility in model specification.
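The hierarchical layout described above can be expressed directly as a nested random-effects model; this REML sketch uses a simulated continuous response standing in for the molecular distances, so it illustrates the layout only, not the authors' analysis.

  library(lme4)
  set.seed(2)
  ## Hypothetical hierarchy: individuals within populations within regions.
  d <- expand.grid(indiv = 1:10, pop = factor(1:4), region = factor(1:3))
  d$y <- rnorm(3)[d$region] + rnorm(12, sd = 0.5)[interaction(d$region, d$pop)] + rnorm(nrow(d))
  ## Populations nested within regions, as in the AMOVA tiers.
  fit <- lmer(y ~ 1 + (1 | region / pop), data = d)
  VarCorr(fit)   # among-region, among-population-within-region, and residual components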


1410 - 1510

MONDAY 30TH NOV Session 3 (Boardroom): Genetics. Chair: John Koolaard

DEVELOPING MODULES IN GENEPATTERN FOR GENE EXPRESSION ANALYSIS

Marcus Davy1 and Mik Black2

1Plant and Food Research
2University of Otago

E-mail marcusdavyplantandfoodconz

GenePattern was developed at the Broad Institute (MIT) and provides a framework for developing modules using common programming languages such as Java, MATLAB and R, which can be integrated within a web-based interface for the analysis of data from microarray experiments (both expression and SNP arrays). As part of the REANNZ funded project 'Integrated Genomics Resources for Health and Disease', a local installation of GenePattern has been established in the Department of Biochemistry at the University of Otago and is accessible via the Kiwi Advanced Research and Education Network (KAREN). This talk will discuss using GenePattern for developing computational genomics modules, and some basic functionality will be demonstrated. In particular we will highlight analysis modules that have been developed locally and are being made available via the GenePattern server located at Otago.


HIGH DIMENSIONAL QTL ANALYSIS WITHIN COMPLEX LINEAR MIXED MODELS

Julian Taylor1 and Ari Verbyla12

1CMIS, CSIRO
2Adelaide University

E-mail juliantaylorcsiroau

There has been a recent focus on variable selection methods in the biosciences to help to understand the influence of underlying genetics on traits of interest In the plant breeding context this involves the analysis of Quantitative Trait Loci (QTLs) from traits measured in complex designed experiments Due to the nature of these experiments extra components of variation such as spatial trends and extraneous environmental variation needs to be accommodated and can be achieved using linear mixed models However with these models the inclusion of an additional high dimensional genetic component becomes problematic This talk discusses the incorporation of high dimensional genetic variable selection in a mixed model framework The methodology shows that this incorporation can be achieved in a natural way even when the number of genetic variables exceeds the number of observations This method is then applied to wheat quality traits and a well established genetic wheat map of 411 markers obtained from the Future Grains group in the Food Futures Flagship in CSIRO This example focusses on the simultaneous incorporation and selection of up to 75000 genetic variables (main QTL effects and epistatic interactions) for some wheat quality traits of interest The results show possibly for the first time that QTL epistatic interactions are obtainable for traits measured in a highly complex designed experiment


CORRELATION OF TRANSCRIPTOMIC AND PHENOTYPIC DATA IN DAIRY COWS

Zaneta Park1 David Pacheco1 Neil Cox1 Alan McCulloch1 Richard Spelman2 and Sue McCoard1

1AgResearch
2Livestock Improvement Corporation

E-mail zanetapark-ngagresearchconz

The relationship between genes and phenotype in dairy cows is both biologically interesting and of vast economic importance to New Zealand. This presentation describes the analysis of a unique set of dairy cow data obtained by Livestock Improvement Corporation, comprising Affymetrix transcriptomic data for 24k genes for both liver and fat samples in >250 dairy cows, and associated phenotypic data (milk yield; protein, casein and total solids percentage and yield; and growth hormone, IGF and insulin levels). This data is highly valuable, as few such large datasets incorporating both genetic and phenotypic data currently exist for dairy cows in New Zealand.

The data was analysed in R, initially using a simple correlation approach to detect significant associations between gene expression and phenotype for all 750k possible combinations. This was followed by fitting a mixed effects regression with sire as a random term. To ensure only quality gene expression data was used, transcriptomic values which failed to reach a "Present" call when applying a Wilcoxon signed rank test were excluded before analysis, as were obvious phenotypic outliers. Fold changes in gene expression between the 10th and 90th percentile phenotypic values were also calculated.
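The correlation screening step can be sketched as below; the object names and simulated dimensions are illustrative only and do not reflect the actual dataset or the mixed-model stage.

  ## Correlate every gene's expression with one phenotype, then adjust the p-values.
  set.seed(3)
  expr <- matrix(rnorm(1000 * 50), nrow = 1000)   # 1000 genes x 50 cows (toy sizes)
  milk <- rnorm(50)                               # one phenotype
  res <- t(apply(expr, 1, function(g) {
    ct <- cor.test(g, milk)
    c(r = unname(ct$estimate), p = ct$p.value)
  }))
  res <- data.frame(res, p.adj = p.adjust(res[, "p"], method = "BH"))
  head(res[order(res$p), ])                       # strongest gene-phenotype associations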

Practicalities of handling such large datasets are also described, including the use of drop-down boxes and the MATCH and OFFSET functions in Excel to plot any gene expression-phenotype combination easily.


1540 - 1700

MONDAY 30TH NOV Session 4 (Swifts): Modelling. Chair: Mario D'Antuono

NON-INFERIORITY MARGINS IN CLINICAL TRIALS

Simon Day1

1Roche Products Ltd

E-mail simondayRochecom

In non-inferiority trials it is well-established practice that a non-inferiority margin needs to be set before the study is conducted The conclusion of the study is then based on comparing the data to that pre-stated margin Different opinions exist on how such margins should be determined Some are highly statistical some are based much more on clinical judgement some are based on absolute differences between treatments some on relative differences There is little consensus across the medical scientific and clinical trials communities on how small such margins should be or even on what the principles for setting margins should be

In a superiority study although we may carry out a significance test of the null hypothesis of zero difference we base decisions about using a treatment on far wider criteria As a minimum we consider the size of the benefit and the size of any adverse effects Rejecting the null hypothesis with a suitably small P-value is not enough to persuade us to use a new treatment nor is failing to reject the null hypothesis necessarily enough to dissuade us from using the treatment

So why is the success of a non-inferiority study based on beating a pre-specified non-inferiority margin, to which not everyone would agree anyway? Well, it isn't - or it shouldn't be. The entire dataset should be considered. A decision should be based on factors including how large the benefit is (or how small it is in relation to an existing treatment), what the adverse effects are, convenience of using the medication and perhaps individual patients' willingness to take it. No single pre-defined margin can ever be sufficient. At best, margin-setting can be a helpful planning tool, but it has less value in assessing the results of a study and little value in therapeutic decision making.


DATA PROCESSING USING EXCEL WITH R

Andrew McLachlan1
1Plant and Food Research

E-mail AndrewMcLachlanplantandfoodconz

In a study of the effects of different ingredients and recipes on food structure, several physical tests were performed on food samples. For each sample these texture analysis and rheological methods generated many data points, which were plotted as curves. Summarising these curves usually involves finding points of interest, such as peaks or troughs and points of maximum slope, which is often done subjectively by eye alone. I describe an Excel-based system using Excel macros and R (via RExcel) that enabled researchers to identify points of interest more objectively and to process large numbers of sample results quickly.
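The objective peak-picking step can be sketched in a few lines of base R; this is an assumed, generic approach (the Excel/RExcel wiring described in the talk is not shown).

  ## Locate local maxima on a smoothed curve via sign changes of the first difference.
  find_peaks <- function(y, min_height = -Inf) {
    turning <- which(diff(sign(diff(y))) == -2) + 1
    turning[y[turning] >= min_height]
  }
  x <- seq(0, 10, by = 0.01)
  y <- sin(x) + 0.1 * rnorm(length(x))        # noisy example trace
  find_peaks(smooth.spline(x, y)$y)           # smooth first, then pick peaks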


INVESTIGATING COVARIATE EFFECTS ON BDD INFECTION WITH LONGITUDINAL DATA

Geoffrey Jones1 Daan Vinke2 and Wes Johnson3

1IFS, Massey University, NZ
2Epicentre, Massey University, NZ

3Department of Statistics UC Irvine USA

E-mail gjonesmasseyacnz

Bovine digital dermatitis (BDD) is defined by typical epidermal lesions on the feet of infected cows It is considered to be a leading cause of infectious lameness BDD lesions tend to be highly painful hence BDD has been identified as a major welfare concern Although economic impacts are difficult to quantify these are thought to be substantial

The only recognized diagnostic test for BDD is foot inspection, which is a time- and labour-intensive procedure. Recently an enzyme-linked immunosorbent assay (ELISA) that measures the level of serum antibodies has been developed using lesion-associated Treponema spp. isolates. The ELISA test, applied to blood samples from individual animals, is more convenient to carry out than foot inspection and is potentially a more sensitive indicator of infection.

A prospective cohort study was carried out on four farms in Cheshire UK to investigate temporal patterns in BDD infection using both foot inspection and the ELISA test Earlier studies had suggested a seasonal pattern as well as dependency on the age and foot hygiene of the individual cow Interestingly our results show seasonality in lesion status but not in the serology score

Here we treat lesion status and serology score as imperfect tests of infection status which we model as an autocorrelated latent binary process Covariate effects can enter in various ways into this model We adopt a parsimonious approach using MCMC for parameter estimation and for predicting the infection history of individual cows


STATISTICAL MODELLING OF INTRAUTERINE GROWTH FOR FILIPINOS

Vicente Balinas1 and Francis Saison2

1University of the Philippines Visayas, Miag-ao, Iloilo, Philippines
2Western Visayas Medical Center, Iloilo City, Philippines

E-mail vtbalyahoocom

This is a cross-sectional study of 5248 low-risk pregnant women with ultrasound measurements of biparietal diameter (BPD), head circumference (HC), abdominal circumference (AC) and femur length (FL) from 20 weeks to 42 weeks of gestation. Objective: The objective of this study is to construct intrauterine growth charts for Filipinos. Methods: Regression analyses for each measurement were fitted using a quadratic model as a function of gestational age. The percentiles (10th, 50th and 90th) were calculated and plotted against age of gestation. The Filipino growth curves were similar to those of previous studies in that the rate of growth of each fetal parameter was almost linear from 20 weeks to 30-34 weeks, after which growth slows down until the end of pregnancy. Results: The standard Filipino growth curves of BPD, HC, AC and FL were compared to the standard growth curves from the Japanese and Chitty data. The Filipino babies were smaller than those in the Chitty data and larger than those in the Japanese data. The results of the two comparative evaluations of growth curves supported the notion that growth differs between populations. Conclusion: The standard growth curves for Filipinos should be used when assessing fetal growth for Filipino babies.

Keywords: fetal growth, biparietal diameter, head circumference, abdominal circumference, femur length
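The quadratic percentile-curve construction described above can be sketched as follows, using simulated measurements (not the Filipino study data) and assuming approximately normal residuals around the fitted mean curve.

  set.seed(4)
  ga  <- runif(500, 20, 42)                                  # gestational age (weeks)
  bpd <- 2 + 0.35 * ga - 0.004 * ga^2 + rnorm(500, sd = 0.25)
  fit <- lm(bpd ~ ga + I(ga^2))                              # quadratic model in gestational age
  s   <- summary(fit)$sigma
  new <- data.frame(ga = 20:42)
  mu  <- predict(fit, newdata = new)
  curves <- data.frame(ga  = new$ga,
                       p10 = mu + qnorm(0.10) * s,           # 10th percentile curve
                       p50 = mu,                             # median curve
                       p90 = mu + qnorm(0.90) * s)           # 90th percentile curve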


1540 - 1700

MONDAY 30TH NOV Session 4 (Boardroom): Ecology. Chair: Rachel Fewster

VISUALISING MODEL SELECTION CRITERIA FOR PRESENCE AND ABSENCE DATA IN ECOLOGY

Samuel Mueller1 Katherine D Tuft2 and Clare McArthur2

1School of Mathematics and Statistics, University of Sydney, Australia
2School of Biological Sciences, University of Sydney, Australia

E-mail muellermathsusydeduau

Useful diagrams for the better understanding of models that are optimal in the sense of the AIC, the BIC or, more generally, the GIC criterion are introduced. A real data example on a rock-wallaby colony in the Warrambungles National Park (New South Wales, Australia) is used for illustration. The probability of the presence or absence of rock wallaby and sympatric macropod scats at 200 evenly distributed plots is modelled as a function of a number of potential predictor variables for either food availability or predator risk. Resampling can further reduce the number of interesting candidate models and is able to assess the stability of chosen models. The presented methodology is not limited to likelihood based model building and can be adapted to any other model selection strategy that consists of (i) a measure of description error and (ii) a penalty for model complexity, e.g. based on the number of parameters in the model.
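As a point of reference, scoring candidate presence/absence models by penalised likelihood criteria looks like the sketch below; the data and model set are simulated stand-ins, and the GIC and the diagnostic plots of the talk are not reproduced.

  set.seed(5)
  d <- data.frame(food = rnorm(200), cover = rnorm(200), slope = rnorm(200))
  d$present <- rbinom(200, 1, plogis(-0.5 + 0.8 * d$food - 0.6 * d$cover))
  forms <- list(present ~ food, present ~ food + cover, present ~ food + cover + slope)
  fits  <- lapply(forms, glm, family = binomial, data = d)
  data.frame(model = sapply(forms, function(f) paste(deparse(f), collapse = "")),
             AIC = sapply(fits, AIC),
             BIC = sapply(fits, BIC))     # smaller is better under either criterion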

ESTIMATING WEIGHTS FOR CONSTRUCTING COMPOSITE ENVIRONMENTAL INDICES

Ross Darnell1 Bill Venables1 and Simon Barry1

1CSIRO Mathematics Informatics and Statistics

E-mail RossDarnellcsiroau

Composite indicators are very common for benchmarking the progress of countries and regions in a variety of policy domains such as ecosystem health. The production of composite indicators is demanded by government policy makers involved in natural resource management. Composite indices are averages of different sub-indices, and the single value which they produce may conceal divergences between the individual components or sub-indices, possibly hiding useful information. In general the weighting problem remains in the realm of subjectivity. Our approach requires experts to provide a relative importance ratio, e.g. how important one indicator is relative to another. The indicator weightings themselves are then estimated using a Bradley-Terry model, which has the capacity to provide consistency checks between and within experts.
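A Bradley-Terry fit of this general kind can be expressed as a logistic regression, as sketched below with fabricated binary expert preferences between hypothetical indicators (the authors use importance ratios, so this is only an assumed, simplified analogue).

  set.seed(6)
  indicators <- c("flow", "turbidity", "nutrients", "biota")
  true <- c(flow = 0, turbidity = 0.5, nutrients = 1, biota = -0.5)   # latent "worth" values
  cmp <- t(combn(indicators, 2))
  cmp <- cmp[rep(1:nrow(cmp), each = 10), ]                # 10 expert judgements per pair
  X <- matrix(0, nrow(cmp), length(indicators), dimnames = list(NULL, indicators))
  X[cbind(1:nrow(cmp), match(cmp[, 1], indicators))] <-  1
  X[cbind(1:nrow(cmp), match(cmp[, 2], indicators))] <- -1
  y <- rbinom(nrow(cmp), 1, plogis(X %*% true))            # 1 = first indicator preferred
  fit <- glm(y ~ X[, -1] - 1, family = binomial)           # first indicator as reference
  w <- exp(c(0, coef(fit)))
  w / sum(w)                                               # estimated relative importance weights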


A SPATIAL DESIGN FOR MONITORING THE HEALTH OF A LARGE-SCALE FRESHWATER RIVER SYSTEM

Melissa Dobbie1 Charis Burridge1 and Bill Senior2

1CSIRO Australia, Mathematics, Informatics and Statistics
2Queensland Department of Environmental and Resource Management

E-mail melissadobbiecsiroau

River systems are complex monitoring domains Their complexity leads to numerous challenges when it comes to designing an optimal spatial monitoring program for making condition assessments including defining an appropriate sampling frame how to best handle the dynamic nature of the system and taking into account the various fieldwork considerations In this talk we discuss some of these challenges and present a spatial design using the generalised random-tessellation stratified (GRTS) approach developed by Stevens and Olsen (2004 JASA 99 262-278) that allocates sparse sampling resources across space to maximise information available and to ensure reliable credible and meaningful inferences can be made about the health of a particular Queensland river system

BACKFITTING ESTIMATION OF A RESPONSE SURFACE MODEL

Jhoanne Marsh C Gatpatan1 and Erniel B Barrios2

1University of the Philippines Visayas
2University of the Philippines Diliman

E-mail cyann_marsyahoocom

The backfitting algorithm is used to estimate a response surface model with covariates from data generated through a central composite design. Backfitting takes advantage of the orthogonality that the central composite design induces in the design matrix. The simulation study shows that backfitting yields estimates and predictive ability of the model comparable to those from ordinary least squares when the response surface bias is minimal. Ordinary least squares estimates generally fail when the response surface bias is large, while backfitting exhibits robustness and still produces reasonable estimates and predictive ability of the fitted model. Orthogonality facilitates the viability of assumptions in an additive model, where backfitting is an optimal estimation algorithm.

Keywords: backfitting, response surface model, second order model, central composite design
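For readers unfamiliar with backfitting, a bare-bones version of the iteration for an additive model y = f1(x1) + f2(x2) + e is sketched below with smoothing-spline components; it illustrates the algorithm only, not the authors' response-surface implementation.

  set.seed(7)
  n <- 200
  x1 <- runif(n); x2 <- runif(n)
  y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)
  f1 <- f2 <- rep(0, n); alpha <- mean(y)
  for (it in 1:20) {
    ## Smooth the partial residuals for each component in turn, centring each fit.
    f1 <- predict(smooth.spline(x1, y - alpha - f2), x1)$y; f1 <- f1 - mean(f1)
    f2 <- predict(smooth.spline(x2, y - alpha - f1), x2)$y; f2 <- f2 - mean(f2)
  }
  resid <- y - alpha - f1 - f2   # residuals after convergence of the backfitting loop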


TUESDAY 1ST DEC

900 Keynote Address (Swifts): Martin Bland, University of York. Chair: Simon Day

CLUSTERING BY TREATMENT PROVIDER IN RANDOMISED TRIALS

J Martin Bland1
1Department of Health Sciences, University of York

E-mail mb55yorkacuk

Methods for the design and analysis of trials where participants are allocated to treatment in clusters are now well established Clustering also happens in trials where participants are allocated individually when the intervention is provided by individual operators such as surgeons or therapists These operators form a hidden sample whose effect is usually ignored Recently trial designers have been considering how they should allow for this clustering effect and funders have been asking applicants the same question In this talk I examine some of these issues and suggest one simple method of analysis


950 - 1030

TUESDAY 1ST DEC Session 1 (Swifts): Missing Data. Chair: Vanessa Cave

THE FUTURE OF MISSING DATA

Herbert Thijs1
1I-Biostat, Hasselt University, Agoralaan 1, Building D, 3590 Diepenbeek, Belgium

E-mail herbertthijsuhasseltbe

In clinical trials related to pain treatment, patients are often followed for a specific period over time, yielding longitudinal pain data. In this type of clinical trial the researcher is almost always confronted with the problem of incomplete data. Commonly used methods to analyze data subject to incompleteness are Complete Case Analysis (CC), Last Observation Carried Forward (LOCF), Baseline Observation Carried Forward (BOCF) and Worst Observation Carried Forward (WOCF); some of them have been used for a very long time, while others were more recently developed in order to cope with increasing criticism. Within the scope of clinical trials on pain, the defense of the above mentioned methods rests on the fact that tolerability of pain drugs is often an issue which needs to be taken into account, and authorities claim that patients who cannot tolerate the drug should be treated as non-responders. Starting from this vision it is then argued that one should impute the pre-randomization measurement for those patients, leading to BOCF.

From a statistical point of view however this cannot be tolerated It can be shown easily that BOCF as well as other carried forward methods violates the information in the data tremendously and lead to biased results which can be both conservative and liberal In this manuscript instead of contributing to the further breakdown of the carried forward family we state NO Carrying Forward (NOCF) as the strategy to deal with missing data in the future More precisely while most researchers feel that different dropout rates in different treatment arms indicates the missing data mechanism to be MNAR we will briefly show that this actually is consistent with MCAR Furthermore we will provide a method to deal with tolerability issues without the use of BOCF by combining both efficacy and tolerability data in a direct likelihood approach


APPLICATION OF LATENT CLASS WITH RANDOM EFFECTS MODELS TO LONGITUDINAL DATA

Ken Beath1
1Macquarie University, Australia

E-mail kbeathsciencemqeduau

Standard latent class methodology has been commonly used to classify subjects based on longitudinal binary data This ignores the possibility of heterogeneity within classes which will result in identification of extraneous classes An improved method is to incorporate random effects into the latent class model This is applied to data on bladder control in children demonstrating the over extraction of classes using standard latent class methods A difficulty with this method is the assumption of normality of the random effect This may be improved by assuming that each class is a mixture

950 - 1030

TUESDAY 1ST DEC Session 1 (Boardroom): Count Data. Chair: Hwan-Jin Yoon

A STRATEGY FOR MODELLING COUNT DATA WHICH MAY HAVE EXTRA ZEROS

Alan Welsh1
1The Australian National University

E-mail AlanWelshanueduau

We will discuss my most recent thoughts on how to approach modelling count data which may contain extra-zeros We will work through an example from the first step of fitting a simple Poisson regression model to ultimately obtaining an adequate model accommodating both possible extra-zeros and possible overdispersion We will illustrate the advantages of separating the effects of overdispersion and extra-zeros and show how to use diagnostics to deal successfully with these issues
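One widely used way to accommodate both extra zeros and overdispersion in R is a zero-inflated negative binomial model, sketched below on the example data shipped with the pscl package; this is a standard route for comparison, not necessarily the modelling strategy recommended in the talk.

  library(pscl)
  data("bioChemists", package = "pscl")
  ## Count part and extra-zero part get separate linear predictors; the NB part absorbs overdispersion.
  fit_zinb <- zeroinfl(art ~ fem + ment | ment, data = bioChemists, dist = "negbin")
  fit_pois <- glm(art ~ fem + ment, family = poisson, data = bioChemists)   # the simple starting point
  c(poisson = AIC(fit_pois), zinb = AIC(fit_zinb))
  summary(fit_zinb)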


A RELIABLE CONSTRAINED METHOD FOR IDENTITY LINK POISSON REGRESSION

Ian Marschner1
1Macquarie University

E-mail ianmarschnermqeduau

An identity link Poisson regression model is more appropriate than the usual log link model if the mean of a count variable is related additively rather than multiplicatively to a collection of predictor variables Such models have a range of applications but are particularly important in epidemiology where they can be used to model absolute differences in disease incidence rates as a function of covariates A well known complexity of identity link Poisson models is that standard computational methods for maximum likelihood estimation can be numerically unstable due to the non-negativity constraints on the Poisson means I will present a straightforward and flexible method based on the EM algorithm which provides reliable and flexible maximisation of the likelihood function over the constrained parameter space The method adapts and extends approaches that have been used successfully in specialised applications involving Poisson deconvolution problems Such methods have not been used previously in conventional regression modelling because they restrict the regression coefficients to be non-negative rather than the fitted means I will show how to overcome this using a sequence of constrained maximisations over subsets of the parameter space after which the global constrained maximum is identified from among the subset maxima Both categorical factors and continuous covariates can be accommodated the latter having either a linear form or a completely unspecified isotonic form The method is particularly useful with resampling methods such as the bootstrap which may require reliable convergence for thousands of implementations The method will be illustrated using data on occupational mortality in an epidemiological cohort and on a biological data set of crab population counts
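To fix ideas, the classical EM-type multiplicative update for an additive (identity-link) Poisson model with non-negative covariates and non-negative coefficients is sketched below; this is the restricted special case that such methods handle directly, and the talk's extension to general covariates via constrained sub-maximisations is not shown.

  em_identity_pois <- function(y, X, iter = 500) {
    beta <- rep(mean(y) / mean(rowSums(X)), ncol(X))   # positive starting values
    for (k in 1:iter) {
      mu <- as.vector(X %*% beta)                      # additive means, always positive here
      beta <- beta * colSums(y * X / mu) / colSums(X)  # combined E- and M-step update
    }
    beta
  }
  set.seed(8)
  X <- cbind(1, matrix(runif(300), ncol = 3))          # non-negative design matrix
  y <- rpois(100, as.vector(X %*% c(1, 2, 0.5, 1)))
  em_identity_pois(y, X)                               # constrained MLE of the additive rates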


1130 - 1230

TUESDAY 1ST DEC Session 2 (Swifts): Medical. Chair: Hans Hockey

MULTIVARIATE RESPONSE MODELS FOR GLOBAL HEALTH-RELATED QUALITY OF LIFE

Annette Kifley1 Gillian Heller1 Ken Beath1 Val Gebski2 Jun Ma1 and David Bulger1

1Macquarie University, Australia
2NHMRC Clinical Trials Centre, University of Sydney, Australia

E-mail annettekifleystudentsmqeduau

Clinical studies often use participant-reported quality of life (QOL) outcomes when comparing treatment effects These assessments usually involve multiple QOL questionnaires each containing a mix of items about diverse specific and global aspects of QOL Quality of life itself is regarded as an unobserved underlying construct

Our objective is to summarise this information about overall health-related QOL in a way that facilitates comparisons between treatments in clinical studies Common approaches include selecting from or averaging the one or two direct global item measures obtained or calculating a summary score from the subdimensional item measures of a QOL questionnaire An alternative is to model a theoretical framework of subdomains of QOL and their relationship with overall QOL The first two approaches do not take advantage of all the information collected while the third assumes that questions of interest fall into a relatively small number of theoretical domains which may not always be the case

We develop and compare multivariate response models for global health-related QOL within a generalised latent variable modelling framework using data from two clinical studies in cancer patients This methodology utilises all the available data accommodates the common problem of missing item responses obviates the need for precalculated or selected summary scores and can capture underlying correlations and dimensions in the data

We consider irregular multilevel models that use a mixture of random and non-random cluster types to accommodate direct and indirect QOL measures Models that delineate QOL scales will be compared with those that delineate QOL domains and the contribution of different variance components will be assessed Since the data comprises a mix of non-normal continuous response measures and ordinal response measures distributional issues will also be considered


ESTIMATION OF OPTIMAL DYNAMIC TREATMENT REGIMES FROM LONGITUDINAL OBSERVATIONAL DATA

Liliana Orellana1 Andrea Rotnitzky23 and James Robins3

1Instituto de Calculo, Universidad de Buenos Aires, Argentina
2Universidad T. di Tella, Buenos Aires, Argentina

3Harvard School of Public Health Boston USA

E-mail lorellanaicfcenubaar

Dynamic treatment regimes also called adaptive strategies are individually tailored treatments based on patient covariate history Optimal dynamic regimes (ODR) are rules that will lead to the highest expected value of some utility function at the end of a time period Many pressing public health questions are concerned with finding the ODR out of a small set of rules in which the decision maker can only use a subset of the observed covariates For example one pressing question in AIDS research is to define the optimal threshold CD4 cell count at which to start prescribing HAART to HIV infected subjects a rule which only depends on the covariate history through the minimum CD4 count

We propose methods to estimate the ODR when the set of enforceable regimes comprises simple rules based on a subset of past information and is indexed by a Euclidean vector x. The substantive goal is to estimate the regime g_x^opt that maximizes the expected counterfactual utility over all enforceable regimes. We conduct inference when the expected utility is assumed to follow models that allow the possibility of borrowing information across regimes and across baseline covariates. We consider parametric and semiparametric models on x and on a set of baseline covariates, indexed by an unknown Euclidean parameter b0. Under these models the optimal treatment g_x^opt is a function of b0, so efficient estimation of the optimal regime depends on the efficient estimation of b0. We derive a class of consistent and asymptotically normal estimators of b0 under the proposed models and derive a locally efficient estimator in the class.


PARAMETRIC CONDITIONAL FRAILTY MODELS FOR RECURRENT CARDIOVASCULAR EVENTS IN THE LIPID

STUDY

Jisheng Cui1 Andrew Forbes2 Adrienne Kirby3 Ian Marschner4 John Simes3 Malcolm West5 Andrew Tonkin2

1Faculty of Health, Medicine, Nursing and Behavioural Sciences, Deakin University, Melbourne, Australia
2Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
3NHMRC Clinical Trials Centre, University of Sydney, Sydney, Australia
4Department of Statistics, Macquarie University, Sydney, Australia
5Department of Medicine, University of Queensland, Brisbane, Australia

E-mail jishengcuideakineduau

Analysis of recurrent event data is frequently needed in clinical and epidemiological studies. An important issue in such analysis is how to account for the dependence of the events in an individual and any unobserved heterogeneity of the event propensity across individuals. We applied a number of conditional frailty and nonfrailty models in an analysis involving recurrent myocardial infarction events in the Long-Term Intervention with Pravastatin in Ischaemic Disease study. A multiple variable risk prediction model was developed for both males and females. A Weibull model with a gamma frailty term fitted the data better than other frailty models for each gender. Among nonfrailty models, the stratified survival model fitted the data best for each gender. The relative risk estimated by the elapsed time model was close to that estimated by the gap time model. We found that a cholesterol-lowering drug, pravastatin (the intervention being tested in the trial), had a significant protective effect against the occurrence of myocardial infarction in men (HR = 0.71, 95% CI 0.60-0.83). However, the treatment effect was not significant in women, due to the smaller sample size (HR = 0.75, 95% CI 0.51-1.10). There were no significant interactions between the treatment effect and each recurrent MI event (p = 0.24 for men and p = 0.55 for women). The risk of developing an MI event for a male who had an MI event during follow-up was about 3.4 (95% CI 2.6-4.4) times the risk for those who did not have an MI event. The corresponding relative risk for a female was about 7.8 (95% CI 4.4-13.6). The number of female patients was relatively small compared with their male counterparts, which may result in low statistical power to find real differences in the effect of treatment and other potential risk factors. The conditional frailty model suggested that, after accounting for all the risk factors in the model, there was still unmeasured heterogeneity of the risk for myocardial infarction, indicating the effect of subject-specific risk factors. These risk prediction models can be used to classify cardiovascular disease patients into different risk categories and may be useful for the most effective targeting of preventive therapies for cardiovascular disease.


1130 - 1230

TUESDAY 1ST DEC Session 2 (Boardroom): Modelling. Chair: Olena Kravchuk

BUILDING A MORE STABLE PREDICTIVE LOGISTIC REGRESSION MODEL

Anna Campain1 Samuel Mueller2 and Jean Yang1

1School of Mathematics and Statistics, Sydney Bioinformatics, Centre for Mathematical Biology, University of Sydney, F07, Sydney NSW 2006
2School of Mathematics and Statistics, University of Sydney, F07, Sydney NSW 2006

E-mail annacmathsusydeduau

Logistic regression is a common method used to construct predictive models in a binary class paradigm. However, missing data, imbalance in class distribution and the construction of unstable models are common problems when analysing data. Each of these three challenges is important and should be addressed so that statistically viable inferences are obtained. Motivated by the analysis of an early pregnancy clinical study with missing covariates and highly imbalanced classes, we have established a method for the development of a stable logistic regression model. Our final developed method is stable, enjoys good independent-set predictive qualities, and is in that sense superior to more basic procedures including case deletion, single imputation with and without weights, and multiple imputation without additional model reduction.
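The multiple-imputation-then-pool workflow referred to above looks like the following in R's mice package; the example data and covariates are those shipped with mice, not the early pregnancy study, and the authors' additional weighting and model-reduction steps are not shown.

  library(mice)
  imp  <- mice(nhanes2, m = 20, printFlag = FALSE)               # impute the incomplete data m times
  fits <- with(imp, glm(hyp ~ age + bmi + chl, family = binomial))  # fit the logistic model to each completed set
  summary(pool(fits))                                            # combine estimates with Rubin's rules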

We further investigated in more detail to what extent missingness introduced bias We were particularly interested in the question How much missingness is too much to produce reliable statistical inference The early pregnancy clinical study was used as a case study to examine how model parameters are influenced by the presence of missing data We borrow a notion from the concept of bioequivalence to determine the maximum amount of missingness that can be present while still producing similar parameter estimates after imputation as those found when data was fully observed It will be shown that the amount of missingness present in the data set and the nature of the variable in question affects the parameter estimates and their respective distributions We believe this to be an important aspect of statistical model building when multiple imputation is applied


STEPWISE PARING DOWN VARIATION FOR IDENTIFYING INFLUENTIAL MULTIFACTOR INTERACTIONS

Jing-Shiang Hwang1 and Tsuey-Hwa Hu1

1Academia Sinica Taiwan

E-mail hwangsinicaedutw

We consider the problem of identifying influential sets of factors related to a continuous response variable from a large number of factor variables This has become an increasingly important problem as more scientists facilitate techniques to produce high dimensional data to unveil hidden information Although several model based methods are promising for the identification of influential marker sets in some real applications each method has its own advantages and limitations The ability of the methods to reveal more true factors and fewer false ones often relies heavily on tuning parameters which is still a difficult task This article provides a completely different solution with a simple and novel idea for the identification of influential sets of variables The method is simple as it involves only repeatedly implementing single-term analysis of variation The main idea is to stepwise pare down the total variation of responses so that the remaining influential sets of factors have increased chances of being identified We have developed an R package and demonstrate its competitive performance compared to several methods available in R packages including the popular group lasso logic regression and Bayesian QTL mapping methods through simulation studies A real data example shows additional interesting findings that result from using the proposed algorithm We also suggest ways to reduce the computational burden when the number of factors is very large say thousands


EMPIRICAL LIKELIHOOD ESTIMATION OF A DIAGNOSTIC TEST LIKELIHOOD RATIO

David Matthews1
1Department of Statistics & Actuarial Science, University of Waterloo

E-mail demattheuwaterlooca

Let p1 and p2 represent the individual probabilities of response to a particular diagnostic test in two subpopulations consisting of diseased and disease-free individuals, respectively. In the terminology of diagnostic testing, p1 is called the sensitivity of the given test and p2 is the probability of a false positive error, i.e. the complement of 1 - p2, which is the test specificity. Since 1975 the ratios r+ = p1 / p2 and r- = (1 - p1) / (1 - p2) have been of particular interest to advocates of evidence-based medicine. For diagnostic tests producing results on a continuous measurement scale, the limiting analogue of these binary classifications of the test result is the ratio rx = f1(x) / f2(x) of the respective test measurement density functions. We describe an empirical likelihood-based method of estimating rx and illustrate its application to test outcome results for the CA 19-9 cancer antigen in 90 patients with confirmed pancreatic cancer and 51 subjects with pancreatitis.


WEDNESDAY 2ND DEC

900 Keynote Address (Swifts): Thomas Lumley, University of Washington. Chair: Alan Welsh

USING THE WHOLE COHORT IN ANALYSIS OF SUBSAMPLED DATA

Thomas Lumley1
1Department of Biostatistics, University of Washington

E-mail tlumleyuwashingtonedu

Large cohort studies in epidemiology typically collect hundreds or thousands of variables over time on thousands of participants. Hospital and insurance databases have fewer variables, but on many more people. In either setting it is common for researchers to wish to measure a new variable on only a subsample of the available people, to reduce costs. Classically this has been either a simple random sample for a "validation study" or a sample stratified on a health outcome for a "case-control study". It is now well established that stratifying the sampling on exposure variables in addition to outcome can lead to a more efficient design. More recent research has focused on ways of using information from people not included in the subsample. I will describe how the combination of ideas from semiparametric modelling and survey statistics gives a straightforward way to use information from the whole cohort in analysing subsamples, and discuss theoretical and practical limits on how much information can be extracted.
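A design-based analysis of such a subsample can be sketched with the two-phase tools in R's survey package; the simulated cohort, variable names and outcome-stratified sampling below are assumptions for illustration, not the designs or estimators discussed in the talk.

  library(survey)
  set.seed(9)
  ## Hypothetical cohort: outcome known for everyone, the expensive variable
  ## measured only on an outcome-stratified subsample.
  cohort <- data.frame(id = 1:5000, outcome = rbinom(5000, 1, 0.1))
  cohort$expensive <- rbinom(5000, 1, 0.2 + 0.3 * cohort$outcome)
  cohort$insample  <- rbinom(5000, 1, ifelse(cohort$outcome == 1, 0.9, 0.1)) == 1
  cohort$expensive[!cohort$insample] <- NA          # unmeasured outside the subsample
  des <- twophase(id = list(~1, ~1), strata = list(NULL, ~outcome),
                  subset = ~insample, data = cohort)
  summary(svyglm(outcome ~ expensive, design = des, family = quasibinomial()))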


950 - 1030

WEDNESDAY 2ND DEC Session 1 (Swifts): Clinical Trials. Chair: Ian Marschner

ADJUSTING FOR NONRESPONSE IN CASE-CONTROL STUDIES

Alastair Scott1 Chris Wild1 and Yannan Jiang2

1Department of Statistics, University of Auckland
2Clinical Trials Research Unit, University of Auckland

E-mail ascottaucklandacnz

Arbogast et al (2002, Biometrical J 44, 227-239) investigated the use of weighted methods originally developed in survey sampling, with weights based on inverse selection probabilities, to handle non-response in population-based case-control studies. We develop the properties of a broader class of estimating equations that includes survey-weighting along with a number of other, more efficient methods. We also look at a fully efficient semi-parametric approach. We compare the efficiency of all the methods in simulations based on the Women's Cardiovascular Health Study (Schwartz et al 1977, Ann Internal Medicine 127, 596-603), the same setting that was used by Arbogast et al for their simulations.


CORRECTING FOR MEASUREMENT ERROR IN REPORTING OF EPISODICALLY CONSUMED FOODS WHEN ESTIMATING DIET-DISEASE ASSOCIATIONS

Victor Kipnis1 Raymond Carroll2 Laurence Freedman3 and Douglas Midthune1

1Biometry, National Cancer Institute, USA
2Texas A and M University

3Gertner Institute for Epidemiology and Policy Research Israel

E-mail kipnisvmailnihgov

A food frequency questionnaire (FFQ) has been the instrument of choice to assess usual dietary intake in most large prospective studies of diet and disease It is now well appreciated that FFQs involve substantial measurement error both random and systematic leading to distorted effects of diet and incorrect statistical tests of diet-disease relationships To correct for this error many cohorts include calibration sub-studies in which more precise short term dietary instruments such as multiple 24-hour dietary recalls (24HRs) or food records are administered as reference instruments Methods for correcting for measurement error of FFQ-reported intakes have been developed under the assumption that reference instruments provide unbiased continuous measurements Application of these methods to foods that are not consumed every day (episodically consumed foods) is problematic since short term reference instruments usually include a substantial proportion of subjects with zero intakes violating the assumption that reported intake is continuous We present the recently developed regression calibration approach to correcting for measurement error in FFQ-reported intake of episodically consumed foods using a repeat unbiased short-term reference instrument in a calibration sub-study We exemplify the method by applying it to data from the NIH-AARP Diet and Health Study


950 - 1030

WEDNESDAY 2ND DEC Session 1 (Boardroom): Fisheries. Chair: Charis Burridge

AN EXPLORATORY ANALYSIS OF THE EFFECTS OF SAMPLING IN MARINE SURVEYS FOR BIODIVERSITY

ESTIMATION

Hideyasu Shimadzu1 and Ross Darnell2
1Geoscience Australia

2CSIRO Mathematics Informatics and Statistics

E-mail HideyasuShimadzugagovau

Scientific trawl and sled surveys are necessary tasks to understand biodiversity in marine science. Differences in the collection methods between locations, and in the processing of species within a location, increase the risk of bias in estimates of biodiversity. Repeat measurements under exactly the same conditions at a site are impractical. To investigate this issue a simple conceptual model is proposed, reflecting the sampling process commonly used in marine surveys, and an exploratory analysis is undertaken. The analysis is based on observations from the Great Barrier Reef Lagoon and highlights that the widely used method called sub-sampling strongly influences presence/absence measures of species, an effect that can no longer be ignored.

ON THE 2008 WORLD FLY FISHING CHAMPIONSHIPS

Thomas Yee1
1University of Auckland

E-mail tyeeaucklandacnz

The World Fly Fishing Championships (WFFC) held last year in the Taupo-Rotorua regions resulted in over 4000 trout captures by about 100 competitors in 5 waterways over 3 competition days Some interesting results are presented in this talk eg catch reduction effects fish length distributions and testing whether teams strategically targeted smaller sized fish Some broader generic issues and recommendations to future fishing competitions are discussed in light of the analyses eg modifying the point system to give greater reward to bigger fish


1100 - 1220

WEDNESDAY 2ND DEC Session 2 (Swifts): Medical Models. Chair: Katrina Poppe

RELATIVE RISK ESTIMATION IN RANDOMISED CONTROLLED TRIALS A COMPARISON OF METHODS

FOR INDEPENDENT OBSERVATIONS

Lisa Yelland1 Amy Salter1 and Philip Ryan1

1The University of Adelaide

E-mail lisayellandadelaideeduau

The relative risk is a clinically important measure of the effect of treatment on a binary outcome in randomised controlled trials (RCTs) An adjusted relative risk can be estimated using log binomial regression however convergence problems are common with this model Alternative methods have been proposed for estimating the relative risk such as log Poisson regression with robust variance estimation (Zou 2004 Am J Epi 159 702-706) Comparisons between methods have been limited however particularly in the context of RCTs We compare ten different methods for estimating the relative risk under a variety of scenarios relevant to RCTs with independent observations Results of an extensive simulation study suggest that when adjustment is made for a binary andor continuous baseline covariate some methods may fail to overcome the convergence problems of log binomial regression while others may substantially overestimate the treatment effect or produce inaccurate confidence intervals Some methods were more affected than others by adjusting for an irrelevant covariate or failing to adjust for an important covariate We give recommendations for choosing a method for estimating the relative risk in the context of RCTs with independent observations
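Two of the approaches compared above can be sketched side by side in R: the log-binomial model and a log-link Poisson model with robust (sandwich) standard errors, fitted here to simulated trial data with illustrative variable names.

  library(sandwich)
  library(lmtest)
  set.seed(10)
  n <- 1000
  treat <- rbinom(n, 1, 0.5); base <- rbinom(n, 1, 0.4)
  y <- rbinom(n, 1, 0.2 * exp(0.4 * treat + 0.3 * base))       # true RR for treatment = exp(0.4)
  ## Log-binomial model; starting values often needed and convergence can still fail.
  logbin  <- glm(y ~ treat + base, family = binomial(link = "log"), start = c(-1.6, 0, 0))
  exp(coef(logbin)["treat"])
  ## Modified Poisson: same log-link RR, with robust variance to fix the standard errors.
  modpois <- glm(y ~ treat + base, family = poisson(link = "log"))
  coeftest(modpois, vcov = vcovHC(modpois, type = "HC0"))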


MULTIPLE STAGE PROCEDURES IN COVARIATE-ADJUSTED RESPONSE-ADAPTIVE DESIGNS

Eunsik Park1 and Yuan-chin Chang2

1Chonnam National University
2Academia Sinica

E-mail: espark02@gmail.com

The idea of a response-adaptive design in clinical trials is to allocate more subjects to the superior treatment during a trial without unduly diminishing statistical significance and efficiency. In addition, innovations in genomic biomedical research are making personalised medicine possible, which also makes adjustment for the covariates of subjects who join the trial an important issue in a clinical trial.

Adaptive design is a longstanding statistical method for situations where the design of an experiment involves unknown parameters that must be estimated during the course of the experiment; thus the concept of sequential analysis is naturally involved. The large-sample properties of estimation in such problems have been studied and can be found in the literature, for example Zhang et al (2007, Annals of Statistics 35, 1166-82). However, the fully sequential setup, which requires both the estimation and design procedures to be executed or adjusted whenever a new subject is included, is not convenient in practice. In this paper we study a modified sequential estimation procedure, the multiple-stage method, which requires the estimation and design to be updated only at each stage, and apply it to compare treatment effects in clinical trials.

This procedure can be repeated more than twice, so it retains the advantages of the fully sequential method to some degree while being more convenient in practice. Here we study the three-stage procedure based on a logistic regression model, which is very popular for evaluating treatment effects when binary responses are observed. A numerical study of synthesised data is also presented.

Traditionally, a response-adaptive (RA) design is used under the assumption of no treatment-covariate interaction, that is, the slopes for the two treatments are equal. However, if there is an interaction between treatment and covariate, that is another reason, besides the ethical one, to consider a covariate-adjusted response-adaptive (CARA) design. In this case the RA design cannot detect the differences in treatment effects, since it is applied under the assumption of no interaction between covariates and treatments. Furthermore, the RA design will make incorrect treatment allocations: it can be correct in one part of the population but completely wrong in the other. Thus, in this case, the CARA design should perform better than the RA design.


In this work we also compare sequential analysis in response-adaptive designs with and without covariate adjustment, and a numerical study of synthesised data is presented.

POTENTIAL OUTCOMES AND PROPENSITY SCORE METHODS FOR HOSPITAL PERFORMANCE COMPARISONS

Patrick Graham1
1University of Otago, Christchurch

E-mail: patrickgraham@otago.ac.nz

Inter-hospital comparisons of patient outcomes are an important initial step in improving hospital performance, because identification of differences in outcome rates raises the possibility that the performance of poorer performing hospitals could be improved by adopting the treatment practices of better performing hospitals. A fundamental issue in making valid hospital performance comparisons is the adequacy of adjustment for case-mix variation between hospitals. In this paper I formulate hospital performance comparisons within the framework of potential outcomes models. This highlights the ignorability assumption required for valid comparisons to be made given observed case-mix variables. Since the number of confounding case-mix variables to be controlled is generally large, implementation of conventional modelling approaches can be impractical. The potential outcomes framework leads naturally to consideration of propensity score methods, which simplify the modelling task by collapsing multiple confounders into a single variable to be controlled in analyses. However, in analyses involving a multiple-category exposure, such as hospital in a multiple-hospital study, multiple propensity scores must be constructed and controlled, and this creates another set of logical and practical modelling problems. Some of these issues are resolved using the approach of Huang et al (2005, Health Services Research 40, 253-278), which involves stratification on propensity scores and subsequent standardisation to produce hospital-specific standardised risks, which are then subject to a smoothing procedure. In this paper I extend this approach in two ways: firstly, by adapting the propensity score methodology to the joint modelling of multiple outcomes, and secondly, by formalising the smoothing of standardised rates within a hierarchical Bayesian framework. The approach and associated methodological issues are discussed in the context of a comparison of 30-day mortality rates for patients admitted to New Zealand public hospitals with acute myocardial infarction, stroke or pneumonia.


LOCAL ODDS RATIO ESTIMATION FOR MULTIPLE RESPONSE CONTINGENCY TABLES

Ivy Liu1 and Thomas Suesse2

1Victoria University
2Centre for Statistical and Survey Methodology, University of Wollongong, Australia

E-mail: iliu@msor.vuw.ac.nz

For a two-way contingency table with categorical variables, local odds ratios are commonly used to describe the relationships between the row and column variables. If a study attempts to control other factors that might influence the relationships, a three-way contingency table can show the associations between the group and response variables controlling for a possibly confounding variable. An ordinary table has mutually exclusive cell counts, i.e. all subjects must fit into one and only one cell. However, in many surveys respondents may select more than one outcome category, so observations can fall in more than one cell of the table. This talk discusses both maximum likelihood and Mantel-Haenszel methods for estimating the local odds ratios in such cases. We derive new dually consistent (co)variance estimators and show their performance with a simulation study.
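
[For reference, the standard definition of the local odds ratios mentioned above, for an I x J table with cell probabilities $\pi_{ij}$; this is the usual two-way-table quantity, and is given here only as background to the multiple-response extension discussed in the talk:]

$$\theta_{ij} = \frac{\pi_{ij}\,\pi_{i+1,j+1}}{\pi_{i,j+1}\,\pi_{i+1,j}}, \qquad i = 1,\dots,I-1, \; j = 1,\dots,J-1.$$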


11:00 - 12:20

WEDNESDAY 2ND DEC, Session 2 (Boardroom): Agriculture/Horticulture. Chair: Emlyn Williams

HABITAT USAGE OF EXTENSIVELY FARMED RED DEER HINDS IN RESPONSE TO ENVIRONMENTAL FACTORS OVER CALVING AND LACTATION

Roger Littlejohn1 and Geoff Asher1
1AgResearch Invermay Agricultural Centre

E-mail: rogerlittlejohn@agresearch.co.nz

Global Positioning System (GPS) technology was used to determine the positions of farmed red deer hinds during calving and lactation on an extensively managed high-country station. Meteorological data were collected from a nearby weather station. We describe an analysis of the data relating hind behaviour (half-hourly distance travelled, altitude, habitat occupancy) to environmental factors. Hinds showed strong individualisation in their core occupancy areas, with collared hinds occupying disjoint areas with different aspects and within variable vegetation zones. Heavier hinds selected lower, flatter zones of naturalised grass, while smaller hinds tended to select higher altitudinal zones dominated by tussock. During the pre-calving/parturition period there was no evidence of any influence of weather variables on behaviour, indicating that reproductive behaviours, described by a simple hidden Markov model, took precedence over general behavioural patterns. During the subsequent lactation period there was clear evidence of diurnal patterns of distances travelled and altitudinal occupation that were moderately influenced by weather variables, with associations between altitude and wind speed, and between distance travelled and solar irradiation and temperature.

SOME STATISTICAL APPROACHES IN ESTIMATING LAMBING RATES

Mario D'Antuono1 and Peter Clarke1
1Dept of Agriculture and Food Western Australia

E-mail: mdantuono@agric.wa.gov.au

In this paper I discuss some statistical approaches to the estimation of lambing rates, and the seeming lack of standard errors in many research papers in animal science in Australia and New Zealand.


FTIR ANALYSIS ASSOCIATIONS WITH INDUCTION AND RELEASE OF KIWIFRUIT BUDS FROM DORMANCY

Denny Meyer1, Murray Judd2, John Meekings3, Annette Richardson3 and Eric Walton4
1Swinburne University of Technology
2Seeka Kiwifruit Industries
3The New Zealand Institute for Plant and Food Research Ltd
4University of Otago

E-mail: dmeyer@swin.edu.au

Many deciduous perennial fruit crops require winter chilling for adequate budbreak and flowering. Recent research has shown that changes in sugar and amino acid profiles are associated with the release of buds from dormancy. This paper uses FTIR spectrometry to provide an alternative mechanism for tracking metabolic changes in the meristems of kiwifruit buds (Actinidia deliciosa (A. Chev.) C.F. Liang et A.R. Ferguson var. deliciosa 'Hayward') during winter dormancy. Ten wave numbers of the FTIR spectra are used to calculate a bud development function. This function has been validated using data from two seasons and four orchards, and by monitoring the effects of hydrogen cyanamide application, sugar concentrations and soil temperatures on this function. It is expected that this FTIR signature can be used to advance our understanding of the influence of the various environmental and physiological factors on the breaking of bud dormancy and shoot outgrowth, including the optimum timing and concentrations of applications of budbreak regulators such as hydrogen cyanamide.


NON-LINEAR MIXED-EFFECTS MODELLING FOR A SOIL TEMPERATURE STUDY

Pauline Ding1
1Australian National University

E-mail: Paulineding@anu.edu.au

There is growing interest in using coarse woody debris (CWD) as a restoration treatment to introduce missing structural complexity to modified ecosystems. CWD is suggested as a source of nutrients and refugia for flora and fauna; however, little is known about how CWD influences the microenvironments it creates.

Logs are the key constituent of CWD, and soil temperature around the logs is one of the indicators used to quantify the effects of CWD in modified woodlands. In the study, a log was selected at random within the Goorooyarroo Nature Reserve in the Australian Capital Territory. Electronic recording devices were buried at various positions to record soil temperature changes during an experimental period of eight days. The treatments of interest were ground cover type (covered, uncovered), distance from the log (0 cm, 10 cm, 20 cm, 40 cm, 80 cm) and depth (1 cm, 5 cm). Two non-linear mixed models were used to study the different treatment effects.

13:30 WEDNESDAY 2ND DEC, Invited Speaker (Swifts): Alison Smith, NSW Department of Industry and Investment. Chair: David Baird

EMBEDDED PARTIALLY REPLICATED DESIGNS FOR GRAIN QUALITY TESTING

Alison B Smith1, Robin Thompson2 and Brian R Cullis1
1Wagga Wagga Agricultural Institute, Australia
2Rothamsted Research, Harpenden, UK

E-mail: alisonsmith@industry.nsw.gov.au

The literature on the design and analysis of cereal variety trials has focussed on the trait of grain yield. Such trials are also used to obtain information on grain quality traits, but these are rarely subjected to the same level of statistical rigour. The data are often obtained using composite rather than individual replicate samples. This precludes the use of an efficient statistical analysis. In this paper we propose an approach in which a proportion of varieties is grain quality tested using individual replicate samples. This is achieved by embedding a partially replicated design (for measuring quality traits) within a fully replicated design (for measuring yield). This allows application of efficient mixed model analyses for both grain yield and grain quality traits.


14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3 (Swifts): Design. Chair: Ross Darnell

SPATIAL MODELS FOR PLANT BREEDING TRIALS

Emlyn Williams1
1Statistical Consulting Unit, ANU

E-mail: emlynwilliams@anu.edu.au

Most plant breeding trials involve a layout of plots in rows and columns. Resolvable row-column designs have proven effective in obtaining efficient estimates of treatment effects. Further improvement may be possible by post-blocking or by inclusion of spatial model components. Options are investigated for augmenting a baseline row-column model by the addition of spatial components. The models considered include different variants of the linear variance model in one and two dimensions. The usefulness of these options is assessed by presenting results from a number of field variety trials.

A TWO-PHASE DESIGN FOR A HIGH-THROUGHPUT PROTEOMICS EXPERIMENT

Kevin Chang1 and Kathy Ruggiero1

1University of Auckland

E-mail: kcha193@aucklanduni.ac.nz

A leading complication, cardiac myopathy (weakening of the heart muscle), is the thickening of the muscle layers in the heart's left ventricle (LV) wall. The molecular pathologies of the diabetes-mediated changes across this wall have been difficult to explore. However, recent advances in high-throughput technologies, for example for proteomic profiling (i.e. protein identification and quantification), are providing a way forward.

We describe the design of a two-phase experiment to compare the proteomes of the inner and outer LV wall of control and diabetic rat hearts. The first phase involves the randomisation of animals to treatments and provides the biological material to be used in the subsequent second-phase, laboratory-based experiment. The second phase involves Multi-dimensional Protein Identification Technology (MudPIT) coupled with isobaric Tags for Relative and Absolute Quantitation (iTRAQ) to measure protein abundances.


SHRINKING SEA-URCHINS IN A HIGH CO2 WORLD: A TWO-PHASE EXPERIMENTAL DESIGN

Kathy Ruggiero1 and Richard Jarrett2
1School of Biological Sciences, The University of Auckland, New Zealand
2CSIRO Mathematics, Informatics and Statistics, Melbourne, Australia

E-mail: kruggiero@auckland.ac.nz

As atmospheric CO2 rises due to human activity, so too does its absorption by the earth's oceans. Evidence is mounting that this phenomenon of ocean acidification decreases the ability of many marine organisms to build their shells and skeletons. A boutique microarray was developed specifically to elucidate the molecular mechanisms at play in the changes in skeletal morphology of sea urchins with increasing seawater acidity.

cDNA samples were derived from sea-urchin larvae subjected to increasing seawater acidity and heat shock temperatures in a Phase 1 experiment. We will describe the important design considerations when assigning pairs of these samples to microarrays in the Phase 2 experiment. While alternative designs are possible at Phase 2, we will show why one design is preferred over another. We will also consider how the data from this design can be analysed using the LIMMA package in R.

14:10 - 15:10

WEDNESDAY 2ND DEC, Session 3 (Boardroom): Functional Analysis. Chair: Marcus Davy

CAN FUNCTIONAL DATA ANALYSIS BE USED TO DEVELOP A NEW MEASURE OF GLOBAL CARDIAC FUNCTION?

Katrina Poppe1, Gillian Whalley1, Rob Doughty1 and Chris Triggs1
1The University of Auckland

E-mail: kpoppe@auckland.ac.nz

Cardiac motion is a continuum that depends on the inter-relationships between the contraction and relaxation phases of the cardiac cycle (heart beat). However, measurements that assess cardiac function are traditionally taken at only brief moments during that process, and assess contraction separately from relaxation.

Three-dimensional ultrasound images of the heart allow volume in the left ventricle (LV) to be measured from each frame of an imaging sequence. Using functional data analysis, the repeated measurements of volume through the cardiac cycle can be converted to a function of time and derivatives taken. Plotting volume against its first and second derivatives traces out a closed loop in three dimensions. After finding the projection that maximises the area within the loop, we can compare the areas during contraction and relaxation, and so develop a new measure of global cardiac function.
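
[A rough sketch in R of the smoothing-and-differentiation step described above; a smoothing spline is used here purely for illustration and is not necessarily the basis the authors chose. The vectors 'times' (frame times) and 'vol' (LV volumes) are hypothetical inputs.]

## Smooth the frame-by-frame LV volumes and evaluate first and second
## derivatives, giving the three coordinates (V, dV/dt, d2V/dt2) of the loop.
fit  <- smooth.spline(times, vol)
grid <- seq(min(times), max(times), length.out = 200)

V   <- predict(fit, grid, deriv = 0)$y
dV  <- predict(fit, grid, deriv = 1)$y
d2V <- predict(fit, grid, deriv = 2)$y

loop <- cbind(V, dV, d2V)   # closed loop traced over one cardiac cycle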


VARIABLE PENALTY DYNAMIC WARPING FOR ALIGNING GC-MS DATA

David Clifford1 and Glenn Stone1

1CSIRO

E-mail: davidclifford@csiro.au

Gas Chromatography Mass Spectrometry (GC-MS) is a technology used in environmental monitoring, criminal forensics, security, food/beverage/perfume analysis, astrochemistry and medicine. It is considered the "gold standard" for detecting and identifying trace quantities of compounds in test substances. The data collected by GC-MS are high dimensional and large, as the technology divides the substance into, and quantifies the amount of, each compound that makes up the test substance. Typically the first step in an analysis of such data is alignment, to correct for the often subtle drift of the gas chromatography part of the system that can occur over time. Once properly aligned, these high-dimensional data are used to find compounds that distinguish between test substances, e.g. different kinds of meat, wine of different quality, blood serum from healthy/non-healthy individuals, etc.

In this talk I highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often "over-warps" signals and introduces artificial features into the aligned data. To overcome this, one can introduce a variable penalty into the DTW process: the penalty is added to the distance metric whenever a non-diagonal step is taken. I will discuss penalty selection and showcase the method using three examples from agricultural and health research. The use of variable penalty DTW significantly reduces the number of non-diagonal moves; in the examples presented here this reduction is by a factor of 30, with no cost to the visual quality of the alignment.

Clifford et al (Anal Chem 2009, 81(3), pp 1000-1007)
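
[A minimal sketch in R of the idea just described: dynamic time warping in which a penalty is added whenever a non-diagonal step is taken. It is written from the description above, not from the authors' implementation; in the actual method the penalty may vary along the signal rather than being a single constant.]

## DTW cumulative cost with a penalty on non-diagonal (warping) moves.
## x and y are numeric signals; larger penalties keep the path near the diagonal.
penalised_dtw <- function(x, y, penalty = 0) {
  n <- length(x); m <- length(y)
  D <- matrix(Inf, n + 1, m + 1)   # cumulative cost, with an infinite boundary
  D[1, 1] <- 0
  for (i in 1:n) {
    for (j in 1:m) {
      cost <- abs(x[i] - y[j])
      D[i + 1, j + 1] <- cost + min(D[i, j],               # diagonal step
                                    D[i, j + 1] + penalty, # non-diagonal step
                                    D[i + 1, j] + penalty) # non-diagonal step
    }
  }
  D[n + 1, m + 1]   # total alignment cost; backtracking would give the path
}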


A MODEL FOR THE ENZYMATICALLY 18O-LABELED MALDI-TOF MASS SPECTRA

Tomasz Burzykowski1, Qi Zhu1 and Dirk Valkenborg2
1I-BioStat, Hasselt University, Belgium
2Flemish Institute for Technological Research, Belgium

E-mail: tomaszburzykowski@uhasselt.be

To reduce the influence of between-spectra variability on the results of peptide quantification, the 18O-labeling approach can be considered. The idea is similar to, for example, two-channel cDNA microarrays: peptides from two biological samples are analysed in the same spectrum. To distinguish between the two samples, peptides from one sample are labeled with a stable isotope of oxygen, 18O. As a result, a mass shift of 4 Da is induced in the peaks corresponding to the isotopic distributions of peptides from the labeled sample, which allows them to be distinguished from the peaks of the peptides from the unlabeled sample and, consequently, the relative abundance of the peptides to be quantified.

However, due to the presence of small quantities of 16O and 17O atoms during the labeling step, the labeled peptide may receive various oxygen isotopes. As a result, not all molecules of the labeled peptide will be shifted by 4 Da in the spectrum. This incomplete labeling may result in biased estimation of the relative abundance of the peptide.

To address this issue, Valkenborg et al (submitted) developed a Markov model which allows one to adjust the analysis of the 18O-labeled spectra for incomplete labeling. The model assumed that the peak intensities observed in a spectrum were normally distributed with a constant variance. This assumption is most likely too restrictive from a practical point of view.

In this paper we describe various extensions of the model proposed by Valkenborg et al. For instance, we allow for a heteroscedastic normal error. We also consider the possibility of including random effects in the model by proposing a Bayesian formulation of the model. We investigate the operational characteristics of the various forms of the model by applying them to real-life mass-spectrometry datasets and by conducting simulation studies.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4 (Swifts): Methods. Chair: David Clifford

HIGH-DIMENSIONAL MULTIPLE HYPOTHESIS TESTING WITH DEPENDENCE

Sandy Clarke1 and Peter Hall1
1University of Melbourne

E-mail: sjclarke@unimelb.edu.au

Multiple hypothesis testing is a research area that has grown considerably in recent years, as the amount of data available to statisticians grows across a variety of applications. High-dimensional contexts have their own challenges, in particular developing testing procedures which detect true effects powerfully whilst keeping the rate of false positives (such as the false discovery rate, or FDR) low.

In these contexts the assumption of independence between test statistics is commonly made, although it is rarely true. This talk will discuss the ramifications of this assumption in the context of a linear process. There are many situations where the effects of dependence are minimal, which is reassuring. However, there may also be instances where this is not the case, which will be discussed in more detail.

Many methods to correct for dependence involve correlation estimates, which are computationally difficult (and even inaccurate) in the high-dimensional context. Others provide overly conservative adjustments and consequently result in a loss of statistical power. Ideally, understanding the effects of dependence on quantities like the FWER or FDR should enable us to improve the power of the procedures used to control these quantities.

As well as summarising some of the existing results in this area, this presentation will consider some of the expected consequences of dependence in cases where it ought not to be ignored, with the aim of developing methods to adjust for it.


METROPOLIS-HASTINGS ALGORITHMS WITH ADAPTIVE PROPOSALS

Renate Meyer1, Bo Cai2 and Francois Perron3
1University of Auckland, New Zealand
2University of South Carolina, USA
3University of Montreal, Canada

E-mail: meyer@stat.auckland.ac.nz

Different strategies have been proposed to improve the mixing and convergence properties of Markov chain Monte Carlo algorithms. These are mainly concerned with customising the proposal density of the Metropolis-Hastings algorithm to the specific target density, and require a detailed exploratory analysis of the stationary distribution and/or some preliminary experiments to determine an efficient proposal. Various Metropolis-Hastings algorithms have been suggested that make use of previously sampled states in defining an adaptive proposal density. Here we propose a general class of adaptive Metropolis-Hastings algorithms based on Metropolis-Hastings-within-Gibbs sampling. For the case of a one-dimensional target distribution, we present two novel algorithms using mixtures of triangular and trapezoidal densities. These can also be seen as improved versions of the all-purpose adaptive rejection Metropolis sampling (ARMS) algorithm for sampling from non-logconcave univariate densities. Using various examples, we demonstrate their properties and efficiencies and point out their advantages over ARMS and other adaptive alternatives such as the Normal Kernel Coupler.


BAYESIAN INFERENCE FOR MULTINOMIAL PROBABILITIES WITH NON-UNIQUE CELL CLASSIFICATION AND SPARSE DATA

Nokuthaba Sibanda1
1Victoria University of Wellington

E-mail: nsibanda@msor.vuw.ac.nz

The problem of non-unique cell classification in categorical data arises when the cell that an observation falls in cannot be uniquely identified. This problem is further compounded when data for some of the categories are sparse. Two approaches for Bayesian estimation of multinomial cell probabilities in such circumstances are developed. In one approach an exact likelihood is used; the second approach uses an augmented-data likelihood. The performance of the two methods is assessed using a number of prior distributions. The methods are demonstrated in the estimation of probabilities of meiotic non-disjunction leading to trisomy 21. Data from one Single Nucleotide Polymorphism (SNP) are used first, and improvements in performance are then assessed for additional SNPs. As a validation check, the estimated probabilities are compared to laboratory estimates obtained using a combination of many SNPs and Short Tandem Repeats.

FILTERING IN HIGH DIMENSION DYNAMIC SYSTEMS USING COPULAS

Jonathan Briggs1
1University of Auckland

E-mail: jbri002@stat.auckland.ac.nz

There is currently no methodology for assimilating moderate- or high-dimension observations into high-dimension spatiotemporal model estimates with a general distribution. In this talk I will propose a new methodology using copulas to approximate data assimilation in this situation. The methodology is simple to implement and provides good results in both a simulation study and a real example.


15:40 - 17:00

WEDNESDAY 2ND DEC, Session 4 (Boardroom): Mixtures & Classification. Chair: Thomas Yee

ON ESTIMATION OF NONSINGULAR NORMAL MIXTURE DENSITIES

Michael Stewart1
1University of Sydney

E-mail: mstewart@usyd.edu.au

We discuss the generalisation of results on univariate normal mixtures in Magder and Zeger (JASA 1996) and Ghosal and van der Vaart (Ann Stat 2001) to the multivariate case. If the mixing distribution has controlled tails, then the Hellinger distance between the fitted and true densities converges to zero at an almost parametric rate at any fixed dimension. We discuss some potential applications to the study of age distribution in fish populations.

ESTIMATION OF FINITE MIXTURES WITH NONPARAMETRIC COMPONENTS

Chew-Seng Chee1 and Yong Wang1

1The University of Auckland

E-mail: chee@stat.auckland.ac.nz

It may sometimes be clear from background knowledge that a population under investigation consists proportionally of a known number of subpopulations whose distributions belong to the same, yet unknown, family. While a parametric assumption can be used for this family, one can also estimate it nonparametrically to avoid distributional misspecification. Instead of using the standard kernel-based estimation suggested in some recent research in the literature, in this talk we describe a new approach that uses nonparametric mixtures to solve the problem. We show that the new approach performs better through simulation studies and some real-world biological data sets.


CLASSIFICATION TECHNIQUES FOR CLASS IMBALANCE DATA

Siva Ganesh1, Nafees Anwar1 and Selvanayagam Ganesalingam1
1Massey University

E-mail: sganesh@massey.ac.nz

Classification is a popular modelling idea in statistics and data mining. It is also known as discriminant analysis in the statistical literature and supervised learning in the machine learning literature. The main aim is to build a function or rule from the training data and to use the rule to classify new data (with unknown class label) into one of the existing classes or groups.

In general, classes or training datasets are approximately equally sized or balanced, and classification techniques assume that misclassification errors cost equally. However, data in the real world are sometimes highly imbalanced and very large. In two-group classification problems, class imbalance occurs when observations belonging to one class or group (the 'majority' class) heavily outnumber the cases in the other class (the 'minority' class). Traditional classification techniques perform badly when they learn from imbalanced training sets. Thus classification on imbalanced data has become an important research problem, with the main interest being in building models that correctly classify the minority class.

In this presentation, a brief overview of the approaches found in the literature is given, followed by details of some proposed alternatives. Two main approaches have been suggested in the literature. The first is to 'balance' the 'imbalance' in the class distribution via sampling schemes, especially over-sampling of minority cases and/or under-sampling of the majority class. The second approach uses cost-sensitive learning, mainly with a high cost for misclassifying minority cases compared to majority cases. Class imbalance involving two classes has been the main focus of studies in the literature. The proposed methods are compared with some of the leading existing approaches on real examples, and the findings are discussed.
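
[A small illustration in R of the first, sampling-based approach mentioned above: over-sampling the minority class and under-sampling the majority class to a common size. The data frame 'train', its class label column and the target size are hypothetical placeholders, not any method proposed in the talk.]

## Re-balance a two-class training set by resampling each class to n_per_class rows.
balance_classes <- function(train, label = "class", n_per_class = 500) {
  idx <- unlist(lapply(split(seq_len(nrow(train)), train[[label]]), function(rows) {
    ## over-sample (with replacement) small classes, under-sample large ones
    sample(rows, n_per_class, replace = length(rows) < n_per_class)
  }))
  train[idx, ]
}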


COMPARISON OF THE PERFORMANCE OF QDF WITH THAT OF THE DISCRIMINANT FUNCTION (AEDC) BASED ON ABSOLUTE DEVIATION FROM THE MEAN

Selvanayagam Ganesalingam1, Siva Ganesh1 and A Nanthakumar1
1Massey University

E-mail: sganesh@massey.ac.nz

The estimation of error rates is of vital importance in classification problems, as it is used as a basis for choosing the best discriminant function, i.e. the one with minimum misclassification error.

Consider the problem of statistical discrimination involving two multivariate normal distributions with equal means but different covariance matrices. Traditionally, a quadratic discriminant function (QDF) is used to separate two such populations. Ganesalingam and Ganesh (2004) introduced a linear discriminant function called the 'Absolute Euclidean Distance Classifier (AEDC)' and compared its performance with that of the QDF on simulated data in terms of the associated misclassification error rates. In this paper, approximate analytical expressions for the overall error rates associated with the AEDC and QDF are derived and computed for various covariance structures in a simulation exercise, and these serve as a benchmark for comparison.

The approximation we introduce in this paper simplifies the amount of computation involved. It also provides a closed-form expression for the tail areas of most symmetrical distributions, which is very useful in many practical situations such as the computation of misclassification error in discriminant analysis.

Keywords: multivariate normal distributions, linear discriminant function, quadratic discriminant function, Euclidean distance classifier, contaminated data


THURSDAY 3RD DEC

9:00 Keynote Address (Swifts): Chris Triggs, University of Auckland. Chair: Ruth Butler

NUTRIGENOMICS - A SOURCE OF NEW STATISTICAL CHALLENGES

Christopher M Triggs1 and Lynnette R Ferguson1
1The University of Auckland and Nutrigenomics New Zealand

E-mail: cmtriggs@auckland.ac.nz

Nutrigenomics New Zealand was established to develop a capability that could be applied to the development of gene-specific personalised foods. The programme includes two Crown Research Institutes (Plant and Food Research and AgResearch) and the University of Auckland, involving 55 named scientists spread across several different New Zealand centres. Inflammatory bowel diseases, in particular Crohn's disease, are being used as proof of principle. There is known genetic susceptibility to Crohn's disease, which is impacted by environment and diet. We have established case-control studies of more than 1000 IBD patients compared with approximately 600 unaffected controls, and are studying these for single nucleotide polymorphisms (SNPs) and gene copy number variants that associate with disease. At the same time we are seeking dietary information, especially in terms of food tolerances and intolerances. By these means we have identified mushroom intolerance in individuals carrying variants in an organic cation transporter gene (OCTN1). These human disease SNPs are incorporated into the design of paired reporter gene constructs, that is, otherwise identical cell lines with and without the variant SNP of interest. These cell lines are then tested in a high-throughput screen to determine the phenotypic effects of nutrients, bioactive compounds and food extracts. The most effective compounds are then tested in mouse models, where the endpoints are transcriptomics, proteomics and metabolomics, as well as direct observation of pathologic changes in the diseased gut. The programme relies on high quality data management and has led to a range of problems in bioinformatics and biostatistics.

Acknowledgements: Nutrigenomics New Zealand (www.nutrigenomics.org.nz) is a collaboration between AgResearch, Plant & Food Research and the University of Auckland, and is largely funded by the New Zealand Foundation for Research, Science and Technology.


9:50 - 10:30

THURSDAY 3RD DEC, Session 1 (Swifts): Genetics. Chair: Ken Dodds

COMBINATION OF CLINICAL AND GENETIC MARKERS TO IMPROVE CANCER PROGNOSIS

Kim-Anh Le Cao1,2, Emmanuelle Meugnier3 and Geoffrey McLachlan4
1ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia
2Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia
3INRA 1235, INSERM U870, INSA-Lyon, Regulations Metaboliques Nutrition et Diabetes, Universite de Lyon, Oullins, France
4Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, 4072 St Lucia, QLD, Australia

E-mail: klecao@uq.edu.au

In cancer studies, various clinical or pathological factors have been evaluated as prognostic factors (e.g. tumour size, histological grade). Although these factors provide valuable information about the risk of recurrence, they are generally not considered sufficient to predict individual patient outcomes. On the contrary, microarray technology is seen as having great potential to give insight into cell biology and biological pathways, and has mostly been used to identify candidate genes for cancer prognosis. However, the clinical application resulting from microarray statistical analyses remains limited. The combination of gene expression and clinical factors could be beneficial for cancer prognosis, as the latter have a very low noise level. This is a challenging task, as the two types of data are different (discrete vs continuous) and have different characteristics. We investigate the integration of clinical data and microarray data to improve the prediction of cancer prognosis using the mixture of experts (ME) of Jordan & Jacobs (1994, Neural Computation 3, 79-87). ME is a promising approach based on a divide-and-conquer strategy for tackling complex and non-linear problems. We further develop the work of Ng & McLachlan (2008, Machine Learning Research Progress, Chap 1, 1-14) and include various gene selection procedures in the proposed integrative ME model. We show on three cancer studies that the accuracy can be improved when combining both types of variables. Furthermore, the genes that were selected in the analysis (e.g. IGFBP5, PGK1 in breast cancer) were found to be of high relevance and can be considered as potential biomarkers for the prognostic selection of cancer therapy in the clinic.


EFFECTIVE POPULATION SIZE ESTIMATION USING LINKAGE DISEQUILIBRIUM AND DIFFUSION APPROXIMATION

Jing Liu1
1Department of Statistics, University of Auckland

E-mail: jliu070@aucklanduni.ac.nz

Effective population size (Ne) is a fundamental parameter of interest in ecology. It provides information on the genetic viability of a population and is relevant to the conservation of endangered species. Linkage disequilibrium due to genetic drift can be used to estimate Ne; however, the performance of the usual estimator can be very poor. In this talk I will introduce a new estimator based on a diffusion approximation, and use simulations to compare its performance with the existing linkage disequilibrium estimator of Ne.

9:50 - 10:30

THURSDAY 3RD DEC, Session 1 (Boardroom): Ecology. Chair: Duncan Hedderley

A MULTIVARIATE FEAST AMONG BANDICOOTS AT HEIRISSON PRONG

Teresa Neeman1 and Renee Visser1

1Australian National University

E-mail: teresaneeman@anu.edu.au

Invertebrates, plants and skinks are hot menu items for the western barred bandicoot (Perameles bougainville), whose diet was assessed following its reintroduction to Western Australia's Heirisson Prong. Analyses of faecal samples from 33 animals captured in winter and 40 animals captured in summer indicate a varying seasonal diet comprising crickets, beetles, pillbugs, ants and other invertebrates. The evident seasonal variations could partially be attributed to prey availability. Multivariate analyses were used to elucidate diet patterns across seasons.


ENVIRONMENTAL IMPACT ASSESSMENTS: A STATISTICAL ENCOUNTER

Dave Saville1
1Saville Statistical Consulting Ltd, Lincoln, NZ

E-mail: savillestat@gmail.com

Recently I was approached by a social scientist who had conducted surveys of the noise and visual impacts of certain public amenities. Statistical advice was sought as to how best to summarise and interpret such data, and comment was sought on previous attempts at analysis. The resulting work raised some interesting issues that I plan to discuss in this talk. Since such data are the subject of hearings, I shall disguise the context using the fictitious setting of an elephant park, with nearby housing developments experiencing the noise impact of trumpeting.


11:00 THURSDAY 3RD DEC, Invited Speaker (Swifts): Kaye Basford, University of Queensland. Chair: Lyn Hunt

ORDINATION OF MARKER-TRAIT ASSOCIATION PROFILES FROM LONG-TERM INTERNATIONAL WHEAT TRIALS

VN Arief1, IH Delacy1,2, J Crossa3, PM Kroonenberg4, MJ Dieters1 and KE Basford1,2
1The University of Queensland, Australia
2Australian Centre for Plant Functional Genomics, Australia
3CIMMYT, Mexico
4Leiden University, The Netherlands

E-mail: kebasford@uq.edu.au

The ability to use trait-associated markers (TAMs) to describe genotypes is essential for selecting genotypes with favourable TAM combinations. Each phenome map in the Wheat Phenome Atlas (Arief et al 2008, 11th International Wheat Genetics Symposium, Brisbane) can be used to obtain three-way, three-mode data arrays of genotypes × TAMs × traits. The concept of a TAM block, defined as markers in a linkage disequilibrium block that show significant association with a trait, can be used to reduce the number of TAMs, control the family-wise error rate and address the non-independence of markers in association analysis. Three-way principal component analysis was applied to selected genotype × TAM block × trait data arrays, based on two phenome maps derived from the analysis of data from the first 25 years of CIMMYT (International Maize and Wheat Improvement Center) Elite Spring Wheat Yield Trials. The results showed that this technique can be used to obtain information about which genotypes carry the favourable TAM block combinations, which TAM blocks discriminate among genotypes, and which TAM block combinations are available for any given combination of genotypes and traits. Different patterns of marker-trait association profiles are observed when analysing the same genotypes for different TAM block and trait combinations, and for data obtained from different combinations of environments. The decision on which data set to use depends on the objective of the study and how the results will be used for cross and parental selection.


11:40 - 12:20

THURSDAY 3RD DEC, Session 2 (Swifts): Medical. Chair: Ken Beath

FINDING BEST LINEAR COMBINATION OF MARKERS FOR A MEDICAL DIAGNOSTIC WITH RESTRICTED FALSE POSITIVE RATE

Yuan-chin Chang1
1Institute of Statistical Science, Academia Sinica

E-mail: ycchang@sinica.edu.tw

We study linear combinations of markers, which usually improve the diagnostic power of individual markers in terms of the receiver operating characteristic (ROC) curve and related measures such as the whole and partial area under the curve (AUC and pAUC, respectively). In some medical diagnostics the false positive rate needs to be confined within a specific range, which makes the pAUC a reasonable criterion in such circumstances. We therefore emphasise the pAUC here, and both parametric and nonparametric methods are discussed.

The parametric approach is an extension of Su and Liu (1993, JASA). We found that the linear combination vector of markers that maximises the partial AUC is

$$l_p = \left( w_D S_D + w_{\bar{D}} S_{\bar{D}} \right)^{-1} \left( m_D - m_{\bar{D}} \right),$$

where $m_D, S_D$ and $m_{\bar{D}}, S_{\bar{D}}$ are the mean vectors and covariance matrices of the disease and non-disease groups respectively, and the coefficients $w_D, w_{\bar{D}} \in \mathbb{R}^1$ depend on the given specificity and are also functions of $l_p$. Thus the solution for $l_p$ requires an iterative procedure. We apply it to the data set of Liu et al (2005, Stat in Med), and the numerical results show that our method outperforms that of Liu et al (2005, Stat in Med).

Moreover, for the case with a large number of markers, we follow the idea of Wang et al (2007, Bioinformatics) and propose a LARS-like algorithm with a different objective function to find a linear combination that maximises the pAUC. This method can be applied to problems where the markers outnumber the subjects. Some large-sample properties of the method are derived. We then apply it to some real data sets, and the results are very promising, locating markers that are never found via AUC-based methods.
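
[As a small illustration in R of the quantity being maximised (not the authors' estimation procedure), the empirical pAUC of a given linear combination can be computed as below; the marker matrix 'x', the 0/1 disease indicator and the false positive limit are hypothetical inputs.]

## Empirical partial AUC of the combination x %*% beta over FPR in [0, fpr_max]:
## equivalently P(case score > control score, control score above the
## (1 - fpr_max) quantile of the control scores).
partial_auc <- function(x, disease, beta, fpr_max = 0.1) {
  score    <- as.vector(x %*% beta)
  cases    <- score[disease == 1]
  controls <- score[disease == 0]
  threshold <- quantile(controls, probs = 1 - fpr_max)
  mean(outer(cases, controls, function(a, b) (a > b) & (b >= threshold)))
}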


A MODIFIED COMBINATION TEST FOR THE ANALYSIS OF CLINICAL TRIALS

Markus Neuhäuser1
1Rhein Ahr Campus

E-mail: neuhaeuser@rheinahrcampus.de

In clinical trials, protocol amendments that change the inclusion criteria are sometimes required. Then the patient populations before and after the amendment may differ. Lösch & Neuhäuser (2008) proposed performing separate statistical tests for the different phases, i.e. for the patients recruited before and after the amendment, and combining the tests using Fisher's combination test. According to simulations, the proposed combination approach can be superior to the 'naïve' strategy of ignoring the differences between the phases and pooling the data (Lösch & Neuhäuser 2008). In contrast to a clinical study with an adaptive interim analysis, blinding can be maintained during the study: both phases are analysed at the end of the study. Therefore, an asymmetric decision rule as proposed by Bauer & Köhne (1994) for adaptive designs is not appropriate. Instead, both α0 and α1 can be applied to the p-values of both phases. Thus, the modified combination test is significant if max(p1, p2) ≤ α1, or if max(p1, p2) ≤ α0 and p1p2 ≤ cα. Of course, other values for α0 and α1 result than in Bauer & Köhne (1994); for example, if α = 0.05 and α0 = 0.5, then α1 equals 0.1793. The proposed modified combination test can also be useful in adaptive designs when a stop after the first phase with rejection of the null hypothesis is not desired. A stop for futility is still possible in case of p1 > α0. Moreover, the modified combination test may be useful for the analysis of multicentre studies. Simulations indicate that in realistic scenarios the modified combination test is superior to both Fisher's original combination test and Bauer & Köhne's method.

Bauer & Köhne (1994, Biometrics 50, 1029-41); Lösch & Neuhäuser (2008, BMC Medical Research Methodology 8, 16)
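
[A minimal sketch in R of the decision rule just described. It assumes the constants α0, α1 and cα have already been determined for the chosen overall level (e.g. α0 = 0.5 and α1 = 0.1793 for α = 0.05, as quoted above); it is an illustration of the rule, not the authors' software.]

## Modified combination test: reject if both phase-wise p-values are small
## enough, either directly (alpha1) or via the product criterion (c_alpha).
modified_combination_test <- function(p1, p2, alpha0, alpha1, c_alpha) {
  reject <- max(p1, p2) <= alpha1 ||
            (max(p1, p2) <= alpha0 && p1 * p2 <= c_alpha)
  futility_after_phase1 <- p1 > alpha0   # a stop for futility remains possible
  list(reject = reject, futility_after_phase1 = futility_after_phase1)
}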


11:40 - 12:20

THURSDAY 3RD DEC, Session 2 (Boardroom): Genetics. Chair: Julian Taylor

BELIEVING IN MAGIC: VALIDATION OF A NOVEL EXPERIMENTAL BREEDING DESIGN

Emma Huang1,2, Colin Cavanagh2,3, Matthew Morell2,3 and Andrew George1,2
1CSIRO Mathematics, Informatics and Statistics
2CSIRO Food Futures National Research Flagship
3CSIRO Plant Industry

E-mail: EmmaHuang@csiro.au

Multiparent Advanced Generation InterCrosses (MAGIC) are an intriguing new experimental breeding design which represents phenotypic and genotypic diversity from across a population. The design was proposed to enable the detection of significant genomic regions with far greater precision than previously possible. A crucial requirement for this is a dense and accurate marker map on which gene-trait associations can be localised. The extended generations of breeding and larger founder population of MAGIC increase the theoretical resolution of such a map. However, marker map construction in complex experimental crosses is nontrivial.

At the heart of map construction is the calculation of marker genotype probabilities. In bi-parental crosses, such as backcrosses and F2 designs, these probabilities are easily calculated because the latent marker genotypes are directly inferred from the observed marker values. In MAGIC populations, due to multiple founders and intermediate generations being unobserved, unambiguously inferring the marker genotypes is often no longer possible. Consequently, current map building software cannot be used for map construction in MAGIC populations.

We have developed methodology and software to handle marker map construction in MAGIC using dominant and codominant markers. These tools have been tested through extensive simulations. They perform well in comparison to published maps when applied to real data from the world's first MAGIC population in wheat. The results thus validate the theoretical superiority of the cross relative to standard designs for marker map construction and QTL mapping.


PHENOTYPES FOR TRAINING AND VALIDATION OF WHOLE GENOME SELECTION METHODS

Ken Dodds1, Benoit Auvray1, Peter Amer2, Sheryl-Anne Newman1 and Sheryl-Anne McEwan
1AgResearch Invermay Agricultural Centre, Mosgiel, New Zealand
2AbacusBio Limited, Dunedin, New Zealand

E-mail: kendodds@agresearch.co.nz

Whole genome selection endeavours to predict the genetic worth of individuals based on a dense set of genetic markers across the genome. The prediction equations are developed on a set of individuals previously genotyped and with sufficient phenotype information. To gauge the usefulness of the genomic prediction, this set of individuals may be divided into internal 'training' and 'validation' sets. This is commonly done on the basis of age, whereby older individuals are used to predict the genetic worth of younger individuals, as whole genome selection is normally used to predict the worth of younger individuals with little or no phenotype information.

One issue that arises is deciding on an appropriate phenotype measurement. This may be a measurement on the individuals that are genotyped, or it might be a combination of information from the individual and their relatives. An example of the latter is the use of estimated breeding values for bulls in the dairy industry. However, it is important that the phenotypes in the training and validation sets are independent. For the dairy example, this will almost be true if each bull has a large number of measured offspring, as in this case the estimated breeding value depends almost entirely on the offspring. In applying these methods to the sheep industry, we find a wide range of offspring numbers. We discuss methods to determine useful training and validation sets, and appropriate phenotypes, for datasets such as those in the sheep industry.


POSTER PRESENTATION ABSTRACTS

Listed Alphabetically by Submitting Author

WHY BE A BAYESIAN?

Ruth Butler1
1Plant and Food Research

E-mail: RuthButler@plantandfood.co.nz

Bayesian approaches to data analysis have become increasingly common, particularly in the last decade. This is in part because the required computing power is now widely available, and in part because there has been an increase in the number of problems that are difficult or intractable with so-called classical approaches but that can be tackled with Bayesian methods. Such problems include the very large data-sets with complex structure that are being generated by geneticists and molecular biologists. The Bayesian approach is also increasingly attractive to many users of statistics because the basic idea of the approach is that prior information (or beliefs) can be explicitly included in the model, and because of increasing dissatisfaction with the meaning of inferences that can be drawn from a classical statistical approach. Bayesian inferences in general have the interpretations that are assumed, but often not valid, for classical inferences. For example, p values are often interpreted in a classical analysis as giving one minus the probability that the null hypothesis is true given the data, whereas in fact this cannot be obtained from a p value without other information. A Bayesian posterior distribution can directly give the required probability (Matthews 2001). In this poster, Bayesian concepts are explored from a practitioner's point of view, including a discussion of motivations for applying a Bayesian approach. Bayesian and classical analyses of two data sets are compared.

1. Matthews (2001, J Stat Plan Inf 94, 43-58)


MIXED MODEL ASSESSMENT OF THE EFFECTS OF WEANER HEIFER MANAGEMENT ON PLASMA PROGESTERONE

Margaret Carr1, Tony Swain2, Olena Kravchuk1 and Geoffry Fordyce2
1School of Land, Crop and Food Sciences, University of Queensland, Qld, Australia
2Queensland Primary Industries and Fisheries, Qld, Australia

E-mail: okravchuk@uq.edu.au

The strong seasonal cycle in North Queensland pasture nutritive value affects the size and maturity of breeding cattle, causing low reproduction rates. A 6-year study trialed Bos indicus cross heifer treatments that could potentially improve reproductive performance. In four years, plasma progesterone (P4) was measured over time as an indicator of maturity. The treatment factors included weaner size, nutrition level and androstenedione (A4) vaccination (which may advance puberty).

This analysis investigated treatment effects on P4 during the post-weaning dry season (age 4 to 11 months). The experimental design was unbalanced, with changes to treatments over years, so the analysis was carried out separately for each of the four years (1990 to 1993). P4 was recorded 5 to 18 times during each heifer's first year, with some missing data (<57) and outliers. Variances increased with P4 level, so the P4 values were log transformed for analysis.

The model fitted by REML in GenStat included paddock and heifer effects as random terms, and unequal variances for the repeated measures, with the heifer by day variance-covariance modelled as the direct product of identity and diagonal matrices (Carr et al, Australasian IBS conference 2007). Manual backward elimination was used to remove non-significant (P > 0.01) fixed effect interactions. The analysis showed different effects between years. The vaccine by weaner size by time interaction was only significant in 1992. The vaccine by time interaction was significant in 1990 and 1993. In 1991 no interactions or main effects were significant. In general, the A4 vaccine significantly increased P4 levels immediately after vaccination, but the vaccine effect diminished as the heifers aged. The nutrition by weaner size by time interaction was significant in 1990.

Overall, the linear mixed model with backward elimination of non-significant terms allowed us to appropriately compare and estimate treatment effects in this study with a complicated, unbalanced repeated measures design.


USING BOOSTED REGRESSION TREES TO ASCERTAIN THE EFFECT OF LANDSCAPE

Patrick Connolly1, David Logan1 and Garry Hill1
1The New Zealand Institute for Plant and Food Research Limited

E-mail: patrickconnolly@plantandfood.co.nz

The number of insects found in any particular location can depend on a very large number of variables, and many of those could interact with one another. One method of showing such interactions is the regression tree. However, though it gives a good representation of the relationship between the variables, such representations are not very stable: omitting a single data point can result in a substantially different picture.

The Boosted Regression Tree (BRT) method is a machine learning approach which produces a series of regression trees, with each successive tree being built to fit the residuals of the preceding tree. Using the R package gbm, typically half of the data is used to establish the trees, which are then used to predict the other half of the data. By examining the predictive ability of the several thousand trees produced in this manner, the relative influence of the variables can be ranked.

Analysing the abundance of two cicada species in kiwifruit orchards showed that they were influenced by largely different sets of variables. Such a result was anticipated by entomologists, whose observations indicate that locations tend to favour one species over the other. Restricting the number of variables to the 12 most influential ones in each case tends to produce better predictions for large values, but is somewhat more likely to overpredict small values. The models can be used to predict abundance for other locations, though they won't have simple sets of coefficients that could be used in spreadsheet calculations.
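
[A rough sketch in R of the kind of gbm fit described above. The data frame 'cicada', the response 'abundance' and all tuning values are placeholders, not the study's actual data or settings.]

## Fit a boosted regression tree model and rank variables by relative influence.
library(gbm)

fit <- gbm(abundance ~ ., data = cicada,
           distribution = "gaussian",
           n.trees = 3000,
           interaction.depth = 3,
           shrinkage = 0.01,
           train.fraction = 0.5,   # half the data builds the trees,
           bag.fraction = 0.5)     # the other half assesses prediction

best <- gbm.perf(fit, method = "test")   # trees minimising held-out error
summary(fit, n.trees = best)             # relative influence of each variable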


ANALYSIS OF PARASITE COUNTS IN NEW ZEALAND LIVESTOCK

John Koolaard1, Dongwen Luo1 and Fred Potter1
1AgResearch Limited

E-mail: johnkoolaard@agresearch.co.nz

Internal parasites are a major problem for the New Zealand livestock industries. We present an analysis of parasite numbers in the faeces of sheep, and in the soil and grass surrounding the faeces. The data come from an 18-month trial at several locations and incorporate a longitudinal aspect as well as serious zero-inflation and heterogeneity issues.

FITTING MIXTURE MODELS TO THE DISTRIBUTIONS OF SWEET-POTATO STARCH PARTICLES AFTER HAMMER MILLING

Olena Kravchuk1 and Peter Sopade2
1School of Land, Crop and Food Sciences, University of Queensland
2Centre for Nutrition and Food Sciences, University of Queensland

E-mail: okravchuk@uq.edu.au

Various distributions of particles of sweet-potato starch were created under different regimes of hammer milling, in an experiment investigating the digestibility of the sweet-potato flour. The experimental factors were the size of the retention sieve and the number of re-grinding passes through the hammer mill. Twenty-four flour samples were produced altogether from six milling settings (two replicates and two sub-samples from each). The distributional data were generated by a particle size analyser (Malvern Instruments Ltd, Malvern WR14 1XZ, UK) in the form of a long binned array of volumetric percentages. The particle size distributions changed in an obvious and complex way with changes in the milling energy. The average volumetric diameter alone was not an adequate summary of the distributions. It was thus necessary to construct a tailored algorithm for summarising the distributional changes and relating them to the digestion rates of the flours measured subsequently. Here we report a comparative analysis of the goodness-of-fit of three-component mixtures of log-normal and two-parameter Weibull distributions, and relate the changes in the parameters of the models to the settings of the mill.


PENALIZED REGRESSION TECHNIQUES FOR PREDICTION: A CASE-STUDY FOR PREDICTING TREE MORTALITY USING REMOTELY-SENSED VEGETATION INDICES

David Lazaridis1, Jan Verbesselt2 and Andrew Robinson3

1Student, The University of Melbourne
2Remote sensing team, CSIRO Sustainable Ecosystems
3Senior Lecturer, The University of Melbourne

E-mail davidlazgmailcom

This paper reviews a variety of methods for constructing regression models. We focus on predicting tree mortality from change metrics derived from Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and collinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as Ridge Regression, the LASSO and Partial Least Squares, which yield more robust predictions. We also suggest efficient strategies that can be used to select optimal models, such as the 0.632+ Bootstrap and Generalized Cross Validation (GCV). The techniques are compared using simulations. They are then used to predict insect-induced tree mortality severity for a Pinus radiata plantation in southern New South Wales, Australia, and their prediction performances are compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
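
As an indicative sketch only (the abstract's own model selection uses the 0.632+ bootstrap and GCV; here cross-validation stands in, and a predictor matrix X of MODIS change metrics and response y are assumed), ridge and LASSO fits of this kind are available through the glmnet package.

library(glmnet)

cv_ridge <- cv.glmnet(X, y, alpha = 0)     # alpha = 0: ridge regression
cv_lasso <- cv.glmnet(X, y, alpha = 1)     # alpha = 1: the LASSO

coef(cv_ridge, s = "lambda.min")                      # coefficients at the cross-validated penalty
# predict(cv_lasso, newx = X_new, s = "lambda.min")   # predictions for new pixels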


CAUTION, COMPOSITIONS: CAN CONSTRAINTS ON 'OMICS' LEAD ANALYSES ASTRAY?

Warren Muller1, David Lovell1, Jen Taylor2 and Alec Zwart1

1CSIRO Mathematics, Informatics and Statistics, Canberra, Australia
2CSIRO Plant Industry, Canberra, Australia

E-mail warrenmullercsiroau

Some DNA or RNA sequencing methods produce data that can be considered as counts of the number of times each sequence was observed in the biological sample. Because the sum of these counts within each sample is constrained by the sequencing process and the physical amount of the sample, they constitute compositional data (Aitchison 1986, Chapman and Hall, 416pp). There are many other examples of compositional data in the 'omics', including relative abundances of species (in metagenomics) or gene ontology (GO) terms (in functional genomics).

Few researchers have broached the issue of analysis of compositional data in 'omics' count surveys, but in the geosciences there has been debate for nearly half a century about how sum-constrained data should be analysed. One important difference between 'omics' and geosciences data is that molecular biology frequently produces compositions with tens, if not hundreds, of thousands of components.

We aim to raise awareness of whether, and in what circumstances, naïve analysis of sum-constrained data could lead to incorrect inference in the 'omics', and to explore the extent to which this might be a problem in applications. In particular, we compare the analysis of log-transformed data to full compositional data analysis.
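
For readers unfamiliar with the compositional approach, a minimal sketch (not taken from the poster) of Aitchison's centred log-ratio transform in R is given below; zero counts would need a pseudocount before the logs are taken.

clr <- function(x) {
  p <- x / sum(x)            # close the vector to a composition
  log(p) - mean(log(p))      # centre on the geometric mean
}

clr(c(100, 300, 600))
clr(c(10, 30, 60))           # identical result: the total (sequencing depth) carries no information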


GOODNESS OF FIT OF GAMMA DISTRIBUTION TO SEA FAUNA WEIGHTS

Mayumi Naka1 and Ritei Shibata1

1Keio University, Yokohama, Japan

E-mail nakastatmathkeioacjp

The gamma distribution is expected to be a good distribution for sea fauna weights, since it is characterized by a stochastic differential equation for growth. We have investigated the validity of the gamma distribution as a model using the NPF trawling experiment data. Using a Probability-Probability plot as a visual tool for validation, we found that the maximum likelihood estimate does not always yield a good fit. This is probably because the gamma distribution is very sensitive to a shift of the shape parameter, particularly when the parameter is relatively small; this is partly explained by a sensitivity analysis of the gamma distribution. As an alternative estimate, we employed a minimum-squares-type estimate of the parameters on the Probability-Probability plot. It worked well, and the identified gamma distribution shows a good fit to the data for all but one of the 83 species.
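
One plausible reading of such a minimum-squares-type estimate (a sketch only, not necessarily the authors' exact criterion) is to choose the gamma parameters that minimise the squared departures of the fitted probabilities from the empirical plotting positions on the P-P plot.

pp_gamma <- function(w) {                        # w: observed weights for one species
  w <- sort(w)
  u <- (seq_along(w) - 0.5) / length(w)          # empirical plotting positions

  crit <- function(par)                          # par = log(shape), log(rate)
    sum((pgamma(w, shape = exp(par[1]), rate = exp(par[2])) - u)^2)

  start <- c(log(mean(w)^2 / var(w)), log(mean(w) / var(w)))   # moment estimates as starting values
  exp(optim(start, crit)$par)                    # shape and rate on the natural scale
}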


TRENDS OF CREATININE AMONG PATIENTS WITH TYPE 2 DIABETES IN ISFAHAN ENDOCRINE AND METABOLISM RESEARCH CENTER (IEMRC) BETWEEN 1997-2007: A LONGITUDINAL STUDY

Sareh Rousta1, Sayed Mohsen Hosseini1 and Masoud Amini1

1Department of Biostatistics and Epidemiology, Faculty of Health, Isfahan University of Medical Sciences

E-mail roustasaragmailcom

Introduction: Diabetes is one of the most prevalent leading causes of death in the world. Complications of diabetes, such as kidney disease, cause patients considerable pain and cost. The creatinine test is one way to evaluate kidney function. This study was designed because of the importance of changes in serum creatinine level over time, the lack of longitudinal data on this trend in Iran, and the power of the statistical models used to analyse longitudinal data.

Materials and Methods: This investigation is an ongoing cohort study that used the files of patients with type 2 diabetes who attended the Isfahan Endocrine and Metabolism Research Center from 1997 to 2007. This information was collected longitudinally. We used linear mixed-effects models to analyse the data.

Results: The linear mixed-effects model showed significant associations between creatinine changes over time and sex, age, diabetes duration, BUN, FBS and systolic blood pressure.
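
A minimal sketch of such a model (variable and data frame names are assumed for illustration, not taken from the study) could be fitted in R with the lme4 package, with a random intercept and slope for each patient.

library(lme4)

fit <- lmer(creatinine ~ years + sex + age + duration + BUN + FBS + sbp +
              (1 + years | patient),
            data = iemrc)
summary(fit)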

Conclusion: The information provided by this study can be used to identify groups at high risk of renal dysfunction.

Key Words: Longitudinal study; Mixed-effects models; Creatinine; Type 2 diabetes


THE RISK FACTORS OF POSTOPERATIVE COMPLICATIONS IN PATIENTS UNDERGOING ISOLATED CORONARY ARTERY BYPASS GRAFT SURGERY

Masoumeh Sanagou1, Baki Billah2 and Christopher Reid2

1PhD student in biostatistics, Department of
2Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia

E-mail Msanagouyahoocom

Purpose: This study was conducted to determine risk factors for two postoperative complications, new renal failure and stroke (any CVA), for isolated coronary artery bypass graft (CABG) surgery in an Australian population, and to develop risk prediction models.

Background: Most studies of postoperative complications for isolated CABG surgery are based on a single population and may not sufficiently reflect clinical characteristics and outcomes when applied to other populations. The present study investigates risk factors for postoperative complications in CABG surgery for an Australian population, because no model has been developed in the Australian context.

Methods: The data were collected from July 2001 to June 2008 from 14 public hospitals in Australia; 14,533 patients underwent isolated CABG during this period. The data were divided into two sets: a model creation set (60%) and a model validation set (40%). The creation set was used to develop the model and the validation set was then used to validate it. Using simple logistic regression, risk factors with p-value < 0.10 were identified as plausible risk factors and then entered into multiple logistic regression. Bootstrapping and stepwise backward elimination methods were used to select significant risk factors. Model discrimination and calibration were evaluated using the ROC curve and the Hosmer-Lemeshow p-value, respectively.

Results: Among the 14,533 patients who underwent CABG over the 8-year period, 77.8% were men and 22.2% were women. The mean (SD) age of the patients was 65.7 (10.3) years. The two postoperative complications were new renal failure (3.65%) and stroke (1.38%). The variables identified as risk factors are as follows. New renal failure: age, gender, intra-aortic balloon pump, previous vascular disease, preoperative dialysis, ejection fraction estimate, CPB time, cardiogenic shock, ventricular assist device (ROC = 0.70, H-L < 0.001). Stroke: cerebrovascular disease, ventricular assist device, age, gender, CPB time, previous vascular disease, preoperative dialysis, urgency of procedure (ROC = 0.73, H-L < 0.001).

Conclusion: We have identified risk factors for two major postoperative complications of CABG surgery.
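
A minimal sketch of this kind of model-building workflow (the variable names and the pROC package are assumptions, and the study's own bootstrapping step is omitted) is shown below.

set.seed(1)
idx        <- sample(nrow(cabg), size = round(0.6 * nrow(cabg)))
creation   <- cabg[idx, ]
validation <- cabg[-idx, ]

# univariable screen: keep candidates with p < 0.10
p_uni <- sapply(c("age", "gender", "iabp", "dialysis", "cpb_time"), function(v)
  coef(summary(glm(reformulate(v, "renal_failure"),
                   family = binomial, data = creation)))[2, 4])
candidates <- names(p_uni)[p_uni < 0.10]

# multiple logistic regression with stepwise backward elimination
full  <- glm(reformulate(candidates, "renal_failure"),
             family = binomial, data = creation)
final <- step(full, direction = "backward")

# discrimination on the validation set
# library(pROC)
# auc(validation$renal_failure, predict(final, validation, type = "response"))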


SOME ISSUES IN MODELLING BIODIVERSITY USING SPATIALLY MODELLED COVARIATES

Scott Foster1 and Hideyasu Shimadzu2

1CSIRO Wealth from Oceans Flagship and CSIRO Mathematics, Informatics and Statistics
2Marine and Coastal Environment Group, Geoscience Australia

E-mail hideyasushimadzugagovau

Investigating how biodiversity varies with the environment is a topic that is currently receiving much attention in ecological science. The approach often taken is to perform some sort of regression-type analysis in which physical variables are covariates. However, those physical variables are most commonly not measured at the same locations as the biological data; instead, they are usually point predictions from spatial models fitted to auxiliary data sources. It is not clear what effects such modelled covariates have on the model, although simple approximations for simple models do give indications. We have performed simulation studies to investigate the manner of the effect and its potential size. The simulations are based on real physical and biological data from the Great Barrier Reef Lagoon.
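
A toy simulation (a generic illustration of the phenomenon, not the authors' simulation design) shows the kind of effect at issue: when a noisy spatial prediction stands in for the true covariate, the estimated slope is attenuated.

set.seed(42)
n       <- 500
x_true  <- rnorm(n)                      # true physical variable
x_model <- x_true + rnorm(n, sd = 0.7)   # spatially modelled prediction, with error
y       <- 2 * x_true + rnorm(n)         # biological response

coef(lm(y ~ x_true))    # slope close to the true value of 2
coef(lm(y ~ x_model))   # slope biased towards zero (attenuation)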


COMBINATIONS OF GENE VARIANTS WHICH ARE TYPICAL FOR THE SEPTIC STATE OF PAEDIATRIC PATIENTS

Michal Smerek1, Jaroslav Michalek1 and Jaroslav Michalek2

1University of Defence, Brno
2UCBI, Masaryk University, Brno

E-mail michalsmerekunobcz

Sepsis represents the main cause of mortality in intensive care units. The aim of this article is to identify the dependency structure of gene variants which have an influence on septic states in paediatric patients.

The data set contains data on 580 paediatric patients of the University Hospital Brno, Czech Republic, and 641 healthy people. Twelve genes (CD14, BPI 216, Il-6 176, LBP 098, IL-RA, TLR 399, TLR 299, BPI Taq, LBP 429, IL 634, HSP70 and TNF beta) were observed. Statistically significant differences between the healthy group and the septic group were observed in variants of the genes TLR 399, TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70; these results were published in [1, 2, 3]. To identify the role of different combinations of gene variants, and to describe the differences in the frequencies of gene variant combinations between the study groups, only the genes TLR 299, BPI Taq, LBP 429, Il-6 176 and HSP70 were used. The gene TLR 399 was not used in the analysis because of its high association with the gene TLR 299. In this way it was possible to create a 5-dimensional contingency table with reasonably high frequencies and to perform a statistical analysis based on hierarchical and graphical log-linear models. The results of the analysis were hierarchical models of the association structure for the chosen genes in the healthy group and in the septic patients group. The typical combinations of gene variants for the healthy group and for the septic patients group were then found. The results correspond nicely to those published in [1, 2, 3] for individual genes, and enable us to recognize the typical combinations of variants of six genes on which attention should be concentrated.
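
Hierarchical log-linear models of this sort can be fitted in R along the following lines (a sketch only; the factor names and data frame are assumed, and the model shown is simply the all-two-way-associations model rather than the models selected in the study).

library(MASS)

tab <- xtabs(~ TLR299 + BPI_Taq + LBP429 + IL6_176 + HSP70, data = septic)   # 5-way contingency table

fit_indep <- loglm(~ TLR299 + BPI_Taq + LBP429 + IL6_176 + HSP70, data = tab)      # mutual independence
fit_2way  <- loglm(~ (TLR299 + BPI_Taq + LBP429 + IL6_176 + HSP70)^2, data = tab)  # all two-way associations

anova(fit_indep, fit_2way)   # likelihood-ratio comparison of the hierarchical models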

References:
[1] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Bactericidal permeability increasing protein gene variants in children with sepsis. Intensive Care Medicine, ISSN 0342-4642, 2007, vol. 33, pp. 2158-2164.
[2] Svetlikova P, Fornusek M, Fedora M, Klapacova L, Bartosova D, Hrstkova H, Klimovic M, Novotna E, Hubacek JA, Michalek J. Sepsis Characteristics in Children with Sepsis [in Czech, abstract in English]. Cesko-Slovenska Pediatrie 59, pp. 632-636, 2004.
[3] Michalek J, Svetlikova P, Fedora P, Klimovic M, Klapacova L, Bartonova D, Hrstkova H, Hubacek JA. Interleukine-6 gene variants and the risk of sepsis development in children. Human Immunology, Elsevier Science Inc, ISSN 0198-8859, 2007, vol. 68, pp. 756-760.


IMPROVED ESTIMATES OF DISPERSAL RATES OF POTATO TUBER MOTH USING A FIELD-SCALE MARK-CAPTURE TECHNIQUE

Andrew R Wallace1, PJ Cameron2, PJ Wigley3, S Elliott3, S Madhusudan, JAD Anderson1 and GP Walker1

1NZ Institute for Plant and Food Research Ltd
2 20 Westminster Rd, Auckland 1024

3BioDiscovery New Zealand Ltd

E-mail andrewwallaceplantandfoodconz

A mark-capture technique, involving field application of Bacillus thuringiensis Berliner (Bt) with a tractor-mounted boom sprayer, was developed to mark large numbers of potato tuber moth, Phthorimaea operculella (Zeller) (Lepidoptera: Gelechiidae), and tested in field experiments over three seasons in the Pukekohe vegetable growing region of South Auckland, New Zealand. Moths captured over 3 days were assayed for the Bt marker, having established persistence of the marking, checked self-marking and cross-contamination rates, and confirmed the absence of background (natural) Bt. Marking rates of captured moths were 78-100% in the sprayed fields and, compared with previous mark-release-recapture studies, marking at ca. 200 m away from the fields (or release point for earlier work) was increased 15-18 fold, to >30 moths per trap. Moths were caught up to 750 m from sprayed fields. This capture rate enabled improved estimates of a dispersal curve for mean trap catch vs distance, with a common curvature parameter, to be fitted. Fitting was via a generalised linear model with a logarithmic link function and Poisson error distribution. Both radial and linear dispersal models for moth dispersal in two dimensions were then fitted, and the average distance travelled in 3 days was calculated for each model. The distance c within which a given proportion p of the moths remained was also estimated, viz. for linear dispersal

exp(-bc)(1 + bc) - (1 - p) = 0

where b is estimated from the dispersal curve. The estimates indicated that 10% of the population dispersed further than 240 m in 3 days. This level of dispersal suggests that measures to restrict the spread of, and manage, potato tuber moth populations, especially if insecticide resistance is present, should be applied over whole growing regions, not within smaller farm-scale areas.
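
A minimal sketch of the two steps (assuming a trap-level data frame traps with columns catch and distance; not the authors' code, and using the equation as reconstructed above):

fit <- glm(catch ~ distance, family = poisson(link = "log"), data = traps)
b   <- -coef(fit)["distance"]             # decay rate of the dispersal curve

dist_containing <- function(p, b)         # distance within which a proportion p remains
  uniroot(function(d) exp(-b * d) * (1 + b * d) - (1 - p),
          interval = c(1e-6, 1e5))$root

dist_containing(0.90, b)                  # e.g. the distance containing 90% of the moths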


IDENTIFYING UNSTABLE PRETERM INFANTS USING MEASURES OF OXYGEN SATURATION IN DIFFERENT BEHAVIOURAL STATES

Marina Zahari1, Dominic Lee1, Glynn Russell2, Brian Darlow3, Brian Scarrott and Marco Reale1

1University of Canterbury
2Imperial College London

3University of Otago

E-mail mza19studentcanterburyacnz

Oxygen saturation levels are routinely monitored for preterm babies in the neonatal intensive-care unit to help detect cardio-respiratory instabilities. In an earlier study (Zahari et al. 2008) we showed that an appropriate combination of the mean level and variability of oxygen saturation, in the form of the coefficient of variation, may be useful for detecting instabilities. In that study, all oxygen saturation measurements across different behavioural states were combined. In this study, involving 17 healthy preterm babies, we isolate three behavioural states (active sleep, quiet sleep and others) and obtain oxygen saturation measures (mean, standard deviation and coefficient of variation) for equal-length segments of measurements in each state. Hierarchical clustering of the empirical distribution functions for each measure is used to put the babies into two groups (stable versus unstable). With the aid of a cluster validity index, results show that clustering based on active sleep segments performs better than the other states. Furthermore, clustering based on the standard deviation is superior to the mean, but clustering based on the coefficient of variation is the best of the three measures.
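
A minimal sketch of this clustering step (illustrative only; a segment-level data frame segments with columns baby, state and cv is assumed):

active <- subset(segments, state == "active")
grid   <- seq(min(active$cv), max(active$cv), length.out = 200)

# evaluate each baby's empirical distribution function on a common grid
ecdf_mat <- t(sapply(split(active$cv, active$baby), function(x) ecdf(x)(grid)))

hc <- hclust(dist(ecdf_mat))    # Euclidean distance between the ECDF curves
cutree(hc, k = 2)               # two groups: putatively stable versus unstable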

Zahari M, Lee DS, Russell G, et al. (2008). Australian Statistical Conference 2008.


Index of Presenting Authors

Arnold R 33
Asghari M 26
Baird D 30
Balinas V 46
Barnes M 27
Basford KE 84
Beath K 51
Bland JM 49
Briggs J 76
Burridge C 28
Burzykowski T 73
Butler R (poster) 89
Campain A 56
Chang K 70
Chang Y 85
Chee C 77
Clark ASS 36
Clarke S 74
Clifford D 72
Connolly P (poster) 91
Cui J 55
D'Antuono M 67
Darnell R (1) 35
Darnell R (2) 47
Davy M 40
Day S 43
Ding P 69
Dobbie M 48
Dodds K 88
Fewster R 37
Forrester R 34
Ganesalingam S 79
Ganesh S 78
Gatpatan JMC 48
Graham M 33
Graham P 65
Huang E 87
Hwang J 57
Ihaka R 36
Jones G 45
Kifley A 53
Kipnis V 61
Koolaard J (poster) 92
Kravchuk O (poster 1) 90
Kravchuk O (poster 2) 92
Lazaridis D 93
Le Cao K 81
Littlejohn R 67
Liu I 66
Liu J 82
Lumley T 59
Marschner I 52
Matthews D 58
McLachlan A 44
Meyer D 68
Meyer R 75
Mohebbi M 38
Mueller S 47
Muller W (poster) 94
Naka M (poster) 95
Neeman T 82
Neuhäuser M 86
Orellana L 54
Park E 64
Park Z 42
Pirie M 32
Poppe K 71
Rousta S (poster) 96
Ruggiero K 71
Ryan L 25
Sanagou M (poster) 97
Saville D 83
Scott A 60
Shimadzu H 62
Shimadzu H (poster) 98
Sibanda N 76
Smerek M (poster) 99
Smith AB 69
Stewart M 77
Stojanovski E 31
Taylor J 41
Thijs H 50
Triggs CM 80
Wallace AR (poster) 100
Wang Y 29
Welsh A 51
Williams E 70
Yee T 62
Yelland L 63
Yoon H 39
Zahari M (poster) 101


DELEGATES LIST

Name E-mail

Alshiha Abdullah aalshihaksuedusa

Arnold Richard richardarnoldmsorvuwacnz

Baird David davidvsnconz

Ball Rod rodballscionresearchcom

Barnes Mary MaryBarnescsiroau

Basford Kaye kebasforduqeduau

Beath Ken kbeathefsmqeduau

Bland Martin mb55yorkacuk

Briggs Jonathan jbri002stataucklandacnz

Burridge Charis charisburridgecsiroau

Burzykowski Tomasz tomaszburzykowskiuhasseltbe

Butler Ruth RuthButlerplantandfoodconz

Campain Anna annacmathsusydeduau

Cave Vanessa vanessacavegmailcom

Chang Kevin kcha193aucklanduniacnz

Chang Yuan-chin ycchangsinicaedutw

Chee Chew-Seng cheestataucklandacnz

Clark Austina aclarkmathsotagoacnz

Clarke Sandy sjclarkeunimelbeduau

Clifford David davidcliffordcsiroau

Collins Marnie marniecunimelbeduau

Connolly Patrick patrickconnollyplantandfoodconz

Cope Stephen scopeaucklandacnz

Cox Neil neilcoxagresearchconz

Cui Jisheng jishengcuideakineduau

Cunningham Ross RossCunninghamanueduau

Curran James curranstataucklandacnz

Dabin Bilyana bilyanadabingmailcom

D'Antuono Mario mdantuonoagricwagovau

Darnell Ross rossdarnellcsiroau

Davy Marcus marcusdavyplantandfoodconz

Day Simon simondayRochecom

Ding Pauline PaulineDinganueduau

Dobbie Melissa melissadobbiecsiroau

Dodds Ken kendoddsagresearchconz

Dow Barbara barbaradowdairyNZconz

Everard Katherine keverardaucklandacnz

Fewster Rachel fewsterstataucklandacnz

Field John johnfieldozemailcomau

Forrester Bob BobForresteranueduau

Galbraith Sally sallygmathsunsweduau


Ganesalingam Selvanayagam sganesalingammasseyacnz

Ganesh Siva sganeshmasseyacnz

Garden Frances francesghealthusydeduau

Graham Patrick patrickgrahamotagoacnz

Graham Michael MichaelGrahamsascom

Hedderley Duncan duncanhedderleyplantandfoodconz

Henderson Harold haroldhendersonagresearchconz

Hepworth Graham hepworthunimelbeduau

Hockey Hans hansbiometricsmatterscom

Hu Bobo bhuaucklandacnz

Huang Emma emmahuangcsiroau

Hunt Lyn lahstatswaikatoacnz

Hwang Jing-Shiang hwangsinicaedutw

Ihaka Ross ihakastatAucklandacnz

Jia Sammie yilinjiaplantandfoodconz

Jones Geoff gjonesmasseyacnz

Jorgensen Murray majwaikatoacnz

Kifley Annette annettekifleystudentsmqeduau

Kipnis Victor kipnisvmailnihgov

Koolaard John johnkoolaardagresearchconz

Kravchuk Olena okravchukuqeduau

Le Cao Kim-Anh klecaouqeduau

Lee Alan leestataucklandacnz

Littlejohn Roger rogerlittlejohnagresearchconz

Liu Ivy iliumsorvuwacnz

Liu Stephen jliu070aucklanduniacnz

Lumley Thomas tlumleyuwashingtonedu

Luo Dongwen dongwenluoagresearchconz

Marschner Ian ianmarschnermqeduau

Matthews David dematthewsuwaterlooca

McArdle Brian bmcardlestataucklandacnz

McDonald Barry BMcDonaldmasseyacnz

McLachlan Andrew AndrewMcLachlanplantandfoodconz

Meyer Denny dmeyerswineduau

Meyer Renate meyerstataucklandacnz

Millar Russell millarstataucklandacnz

Mohebbi Mohammadreza MRMmohebbiyahoocom

Mueller Samuel muellermathsusydeduau

Muller Warren warrenmullercsiroau

Naka Mayumi nakastatmathkeioacjp

Neeman Teresa teresaneemananueduau

Neuhaeuser Markus neuhaeuserrheinahrcampusde


Niven Brian bnivenmathsotagoacnz

O'Brien Gabrielle gabrielleobriensascom

Orellana Liliana orellanadmubaar

O'Sullivan Maree mareeosullivancsiroau

Park Zaneta ZanetaPark-Ngagresearchconz

Park Eunsik espark02gmailcom

Pirie Maryann mpir007aucklanduniacnz

Pledger Shirley shirleypledgervuwacnz

Pledger Megan MeganPledgervuwacnz

Poppe Katrina kpoppeaucklandacnz

Potter Fred fredpotteragresearchconz

Rohan Maheswaran mrohandocgovtnz

Ruggiero Kathy kruggieroaucklandacnz

Ryan Louise LouiseRyancsiroau

Sanagou Masomeh Msanagouyahoocom

Saville Dave savillestatgmailcom

Scott Alastair ascottaucklandacnz

Shimadzu Hideyasu hideyasushimadzugagovau

Sibanda Nokuthaba nsibandamsorvuwacnz

Smerek Michal michalsmerekunobcz

Smith Alison alisonsmithindustrynswgovau

Stewart Michael mstewartusydeduau

Stojanovski Elizabeth ElizabethStojanovskinewcastleeduau

Taylor Julian juliantaylorcsiroau

Thijs Herbert herbertthijsuhasseltbe

Triggs Chris cmtriggsaucklandacnz

Upsdell Martin martinupsdellagresearchconz

van Koten Chikako chikakovankotenagresearchconz

Wallace Andrew andrewwallaceplantandfoodconz

Waller John johnwalleragresearchconz

Wang You-Gan you-ganwangcsiroau

Wang Yong yongwangstataucklandacnz

Welsh Alan AlanWelshanueduau

Wild Chris wildstataucklandacnz

Williams Emlyn emlynwilliamsanueduau

Yee Thomas yeestataucklandacnz

Yelland Lisa lisayellandadelaideeduau

Yoon Hwan-Jin hwan-jinyoonanueduau

Yoshihara Motoko motokoyoshihararochecom

Yuan Verina verinayuanagresearchconz

Zahari Marina mza19studentcanterburyacnz
