ACCRE R9 Cluster Quick and Dirty Status
Report generated at Thu Oct 23 03:23:01 PM CDT 2025
Problem Nodes
HOSTNAMES STATE AVAIL_FEATURES TIMESTAMP USER REASON
cn0001 drained x86-64v3 2025-05-22T15:29:13 appelte1 Sam - NA - Hammerspace testing, decom when done
cn1271 draining intel_e5-2630_v3,haswell,intel 2025-10-22T15:53:27 root Kill task failed
cn1372 draining intel_e5-2630_v3,haswell,intel 2025-10-21T21:19:25 root Kill task failed
cn1451 drained* intel_4116,skylake,intel,x86-6 2025-09-10T14:19:11 broadrt Troy - RT93447 - backplane connector issue
cn1484 draining intel_4110,skylake,intel,x86-6 2025-10-21T23:35:38 root Kill task failed
cn1539 drained intel_5218,cascadelake,intel,x 2025-02-05T16:04:39 root Alan - RT N/A - NO TOUCHY?
cn1578 down* intel_5218,cascadelake,intel,x 2025-10-22T15:01:56 slurm Not responding
cn1626 drained amd_7313,zen2,zen,amd,x86-64v3 2025-10-01T15:30:41 broadrt Troy - RT95231 - repeated down; not responding
cn1706 draining amd_9754,zen4,zen,amd,x86-64v4 2025-10-23T14:20:40 root Kill task failed
dgx04 drained dgx,a100_80gb,amd_7742,zen2,ze 2025-10-22T14:50:48 goffta1 Thomas - RTtbd - Create reservation for DSI
gpu0024 drained* pascal,p3840,intel_e5-2623_v4, 2025-10-16T16:04:55 broadrt Troy - RT95520 - P2-DIMMH1, decom
gpu0035 drained* turing,intel_5118,skylake,inte 2025-09-29T09:49:11 broadrt Troy - RT94482 - CPU Voltage issue
gpu0036 drained* turing,intel_5118,skylake,inte 2025-09-29T14:05:42 broadrt Troy - RT91761 - System instability, voltage
gpu0037 drained turing,intel_5118,skylake,inte 2025-09-29T14:06:35 broadrt Troy - RT91524 - frequent reboots
gpu0038 inval turing,intel_5118,skylake,inte 2025-10-23T14:05:38 slurm gres/gpu count reported lower than configured (0 < 4
gpu0040 inval turing,intel_5118,skylake,inte 2025-10-23T15:02:16 slurm Low RealMemory (reported:353823 < 100.00% of configu
gpu0041 drained* turing,intel_5118,skylake,inte 2025-09-29T14:25:00 broadrt Troy - RT86927 - Memory and GPU issues : Not respond
gpu0044 drained* turing,intel_5118,skylake,inte 2025-09-29T14:28:50 broadrt Troy - RT91494 - Decom per Eric
gpu0045 drained* turing,intel_5118,skylake,inte 2025-09-29T14:29:31 broadrt Troy - Provisioning - resume when R9 and FIXED
gpu0046 inval turing,intel_5118,skylake,inte 2025-10-23T15:02:10 slurm gres/gpu count reported lower than configured (3 < 4
gpu0048 drained* turing,intel_5118,skylake,inte 2025-09-29T14:30:13 broadrt Troy - Provisioning - resume when R9 and FIXED
gpu0049 drained* turing,intel_5118,skylake,inte 2025-09-29T14:30:28 broadrt Troy - gres/gpu count reported lower than configured
gpu0053 drained* turing,intel_5118,skylake,inte 2025-10-07T16:21:06 slurm Troy - gres/gpu count reported lower than configured
gpu0056 drained turing,intel_5118,skylake,inte 2025-10-17T10:11:56 broadrt Troy - RT92146 - cpu voltage issue
gpu0057 drained* turing,intel_4214r,cascadelake 2025-09-16T10:29:50 broadrt Troy - RT91522 - unstable system
gpu0060 inval a4000,amd_7313,zen3,zen,amd,x8 2025-10-23T15:02:10 slurm gres/gpu count reported lower than configured (0 < 8
gpu0061 drained a4000,amd_7313,zen3,zen,amd,x8 2025-09-29T14:31:44 broadrt Troy - RT91685 - Imaging and setting /tmp partitioni
gpu0063 draining a6000,intel_platinum_8358,icel 2025-10-23T10:39:16 broadrt Troy - RT95589 - reimage allocating /tmp and /csbtmp
gpu0064 drained a6000,intel_platinum_8358,icel 2025-10-23T15:02:10 slurm gres/gpu count reported lower than configured (0 < 4
gpu0066 drained a6000,intel_platinum_8358,icel 2025-10-23T10:39:27 broadrt Troy - RT95589 - reimage allocating /tmp and /csbtmp
gpu0067 draining a6000,intel_platinum_8358,icel 2025-10-23T10:43:44 broadrt Troy - RT95589 - reimage allocating /tmp and /csbtmp
gpu0068 drained a6000,intel_platinum_8358,icel 2025-10-22T14:38:52 slurm Troy - RT95589 - reimage allocating /tmp and /csbtmp
gpu0084 drained* zen,a100 2025-09-16T09:13:01 root Sai - NA - RMA to Exxact
Queue Summary (Batch)
GROUP USER ACTIVE_JOBS ACTIVE_CORES PENDING_JOBS PENDING_CORES
-----------------------------------------------------------------------------------------
accre_guests 3 16 246 2548
gerritcg 3 16 246 2548
-----------------------------------------------------------------------------------------
aldrich_lab 1 1 0 0
amannn1 1 1 0 0
-----------------------------------------------------------------------------------------
beam_lab 2 32 0 0
zhuj29 2 32 0 0
-----------------------------------------------------------------------------------------
behringer_lab 1 16 0 0
haleof 1 16 0 0
-----------------------------------------------------------------------------------------
bias_group 1 5 0 0
biasds 1 5 0 0
-----------------------------------------------------------------------------------------
booth_lab 2 8 0 0
chenh55 1 4 0 0
mathura 1 4 0 0
-----------------------------------------------------------------------------------------
brg 1 16 0 0
kandelr 1 16 0 0
-----------------------------------------------------------------------------------------
brg_cores 1 16 2 32
kandelr 0 0 2 32
xuy33 1 16 0 0
-----------------------------------------------------------------------------------------
cgg 0 0 1 64
liy110 0 0 1 64
-----------------------------------------------------------------------------------------
cms 85 3253 620 1466
cmslocal 41 2164 290 716
cmspilot 34 1064 328 748
uscmslocal 10 25 2 2
-----------------------------------------------------------------------------------------
coxlab 1 7 0 0
scalica 1 7 0 0
-----------------------------------------------------------------------------------------
cqs_si 0 0 4 8
chenarsw 0 0 4 8
-----------------------------------------------------------------------------------------
davis_lab 0 0 28 110
niarcm2 0 0 26 108
salerl1 0 0 1 1
tsail2 0 0 1 1
-----------------------------------------------------------------------------------------
edwards_lab 1 5 0 0
gorejl1 1 5 0 0
-----------------------------------------------------------------------------------------
escudero_lab 0 0 2000 2000
seifis1 0 0 2000 2000
-----------------------------------------------------------------------------------------
feng_lab 13 52 23 92
jiangl1 13 52 23 92
-----------------------------------------------------------------------------------------
finstata_group 1 20 0 0
hand7 1 20 0 0
-----------------------------------------------------------------------------------------
g_gamazon_lab 3 50 1 8
shetkak 1 2 0 0
wudh 2 48 1 8
-----------------------------------------------------------------------------------------
hadjim_lab 1 4 0 0
reasosa2 1 4 0 0
-----------------------------------------------------------------------------------------
h_biostat_kang 75 75 200 200
yanb1 75 75 200 200
-----------------------------------------------------------------------------------------
h_biostat_student 9 83 320 349
chenh36 0 0 1 30
koy2 4 48 0 0
namy1 1 1 0 0
shil10 2 2 118 118
slonej 0 0 201 201
yih4 2 32 0 0
-----------------------------------------------------------------------------------------
h_cqs 4 46 2 9
shengq1 1 5 2 9
xuh14 2 40 0 0
yangj24 1 1 0 0
-----------------------------------------------------------------------------------------
h_cutting_lab 0 0 526 2104
harrioem 0 0 526 2104
-----------------------------------------------------------------------------------------
h_lu_lab 1 10 0 0
parkj71 1 10 0 0
-----------------------------------------------------------------------------------------
hodges_lab 1 1 0 0
dayj3 1 1 0 0
-----------------------------------------------------------------------------------------
h_oguz_lab 0 0 16 256
wanj119 0 0 16 256
-----------------------------------------------------------------------------------------
h_vmac 6 24 248 1922
contra2 0 0 234 1872
gonzm14 0 0 1 1
waltes4 0 0 1 1
zhanm32 6 24 12 48
-----------------------------------------------------------------------------------------
h_vuiis 2 5 1 1
rogerbp1 0 0 1 1
vuiis_archive 2 5 0 0
-----------------------------------------------------------------------------------------
isde-rer 2 9 0 0
champaca 2 9 0 0
-----------------------------------------------------------------------------------------
kojetin_lab 1 16 0 0
kojetid 1 16 0 0
-----------------------------------------------------------------------------------------
l2_jan_lab 2 15 102 1202
davida7 0 0 100 1200
olivij1 1 7 1 1
zhour8 1 8 1 1
-----------------------------------------------------------------------------------------
l3_aboud_lab 1 64 0 0
hongm1 1 64 0 0
-----------------------------------------------------------------------------------------
l3_precision_nutrition_lab 2 3 1 4
baghem1 2 3 1 4
-----------------------------------------------------------------------------------------
l3_vuiis_cci 1 2 0 0
vuiis_daily_s 1 2 0 0
-----------------------------------------------------------------------------------------
l3_watts_lab 5 20 29 130
rosena 5 20 29 130
-----------------------------------------------------------------------------------------
lea_lab 52 105 72 72
arneram 2 5 0 0
watowm1 50 100 72 72
-----------------------------------------------------------------------------------------
leech_simulation 0 0 2 32
shij13 0 0 2 32
-----------------------------------------------------------------------------------------
mchs_compbio 1 20 0 0
riedlio 1 20 0 0
-----------------------------------------------------------------------------------------
mcml 0 0 2 192
odenyogg 0 0 2 192
-----------------------------------------------------------------------------------------
nbody 73 292 182 470
ligo 73 292 182 470
-----------------------------------------------------------------------------------------
neurogroup 2 4 0 0
leeth2 1 1 0 0
roggeokk 1 3 0 0
-----------------------------------------------------------------------------------------
ng_lab 1 8 0 0
kimj119 1 8 0 0
-----------------------------------------------------------------------------------------
palmeri_lab 67 67 0 0
jeongj6 67 67 0 0
-----------------------------------------------------------------------------------------
p_csb_meiler 614 1510 8589 89489
belle6 27 27 970 970
resv146 6 12 0 0
seltmaa 136 136 1053 3654
tydingcw 0 0 6023 84322
yange8 445 1335 543 543
-----------------------------------------------------------------------------------------
p_dsi 1 16 3 3
malikm2 1 16 0 0
yangi1 0 0 3 3
-----------------------------------------------------------------------------------------
p_englot_group 10 300 8 240
makhoug 10 300 8 240
-----------------------------------------------------------------------------------------
p_masi 426 702 1631 1631
kimm58 300 450 1631 1631
saundam1 126 252 0 0
-----------------------------------------------------------------------------------------
p_meiler 0 0 2 7
kaermel 0 0 1 6
yange8 0 0 1 1
-----------------------------------------------------------------------------------------
rer 7 58 0 0
hum6 1 16 0 0
karomnj 5 40 0 0
theowc 1 2 0 0
-----------------------------------------------------------------------------------------
rke_group 1 16 0 0
maduren 1 16 0 0
-----------------------------------------------------------------------------------------
rokaslab 54 204 125 250
carvajj 1 5 0 0
copea1 1 1 0 0
danist 44 120 125 250
hatmakea 2 16 0 0
rangem1 2 16 0 0
riedlio 1 10 0 0
sautet1 3 36 0 0
-----------------------------------------------------------------------------------------
rubinov_lab 2 36 0 0
abbasia 1 30 0 0
sardarn 1 6 0 0
-----------------------------------------------------------------------------------------
ruderferlab 1 2 0 0
abehd1 1 2 0 0
-----------------------------------------------------------------------------------------
sarkar_lab 1 32 0 0
sarkah1 1 32 0 0
-----------------------------------------------------------------------------------------
sbcs 9 66 0 0
gunchiv 1 30 0 0
liq17 6 15 0 0
lyul1 1 1 0 0
xus15 1 20 0 0
-----------------------------------------------------------------------------------------
shah_lab 1 12 0 0
linp6 1 12 0 0
-----------------------------------------------------------------------------------------
stassun 0 0 1 60
medani 0 0 1 60
-----------------------------------------------------------------------------------------
stein_lab 1 8 50 59
karakg1 0 0 50 59
shellejp 1 8 0 0
-----------------------------------------------------------------------------------------
taylor_group 5 42 0 0
lambwg 4 40 0 0
schultls 1 2 0 0
-----------------------------------------------------------------------------------------
tk_lab 5 200 0 0
yoonh15 5 200 0 0
-----------------------------------------------------------------------------------------
vgi 21 284 0 0
gaow9 20 280 0 0
lifferjt 1 4 0 0
-----------------------------------------------------------------------------------------
walker_lab 4 80 2 2
buttc 3 48 0 0
kastnpd1 0 0 2 2
walkeas2 1 32 0 0
-----------------------------------------------------------------------------------------
wankowicz_lab 792 792 20778 20778
wankows 792 792 20778 20778
-----------------------------------------------------------------------------------------
williams_roberson_lab 1 1 0 0
yeohb1 1 1 0 0
-----------------------------------------------------------------------------------------
womelsdorf_lab 2 42 251 2604
azezewka 0 0 2 40
gerritcg 2 42 249 2564
-----------------------------------------------------------------------------------------
yang_lab_csb 2 40 0 0
jurichc 2 40 0 0
-----------------------------------------------------------------------------------------
zhu_group 0 0 1 32
zhuw12 0 0 1 32
-----------------------------------------------------------------------------------------
Totals: 2385 8813 36069 128426
Queue Summary (Batch GPU)
GROUP USER ACTIVE_JOBS ACTIVE_GPUS PENDING_JOBS PENDING_GPUS
-----------------------------------------------------------------------------------------
accre_guests_acc 1 1 0 0
liy110 1 1 0 0
-----------------------------------------------------------------------------------------
csb_gpu_acc 18 69 6 14
bisigp1 1 4 0 0
cryosparcuser 0 0 1 1
karadim 17 65 1 1
zhengm9 0 0 4 12
-----------------------------------------------------------------------------------------
h_vmac_acc 1 2 188 188
shashn1 0 0 188 188
yangy48 1 2 0 0
-----------------------------------------------------------------------------------------
mchaourab_acc 0 0 195126 195126
kaot1 0 0 195126 195126
-----------------------------------------------------------------------------------------
mchaourab-csb_acc 2 2 0 0
wut18 2 2 0 0
-----------------------------------------------------------------------------------------
mltf_acc 1 1 0 0
sohailu 1 1 0 0
-----------------------------------------------------------------------------------------
nbody_acc 1 8 0 0
khanfm 1 8 0 0
-----------------------------------------------------------------------------------------
p_meiler_acc 26 26 0 0
belle6 1 1 0 0
seltmaa 8 8 0 0
tydingcw 17 17 0 0
-----------------------------------------------------------------------------------------
psychology_gpu_acc 10 10 62 62
gerritcg 10 10 62 62
-----------------------------------------------------------------------------------------
taylor_group_acc 0 0 1 1
criswea 0 0 1 1
-----------------------------------------------------------------------------------------
wei_lab_acc 1 2 1 2
suw3 1 2 1 2
-----------------------------------------------------------------------------------------
Totals: 61 121 195384 195393
Queue Summary (interactive)
GROUP USER ACTIVE_JOBS ACTIVE_CORES PENDING_JOBS PENDING_CORES
-----------------------------------------------------------------------------------------
edwards_lab_int 1 4 0 0
seaglehm 1 4 0 0
-----------------------------------------------------------------------------------------
g_giri_group_int 2 8 0 0
breyem3 1 4 0 0
chenh45 1 4 0 0
-----------------------------------------------------------------------------------------
rubinov_lab_int 2 40 0 0
mohamb2 1 32 0 0
rubinom 1 8 0 0
-----------------------------------------------------------------------------------------
yang_lab_int 1 8 0 0
shaoq1 1 8 0 0
-----------------------------------------------------------------------------------------
Totals: 6 60 0 0
Queue Summary (interactive_gpu)
GROUP USER ACTIVE_JOBS ACTIVE_GPUS PENDING_JOBS PENDING_GPUS
-----------------------------------------------------------------------------------------
Totals: 0 0 0 0
Partition Summary
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
interactive up 14-00:00:0 1 drain cn0001
interactive up 14-00:00:0 4 mix cn[1301,1328,1804,1813]
interactive up 14-00:00:0 1 alloc cn1287
interactive up 14-00:00:0 20 idle cn[1302,1322-1326,1329-1330,1707,1800-1803,1805-1811]
batch* up 14-00:00:0 1 drain* cn1451
batch* up 14-00:00:0 1 down* cn1578
batch* up 14-00:00:0 4 drng cn[1271,1372,1484,1706]
batch* up 14-00:00:0 2 drain cn[1539,1626]
batch* up 14-00:00:0 230 mix cn[1202-1203,1205-1206,1209,1211-1213,1217-1218,1220-1221,1228-1229,1231-1233,1236,1242,1258-1259,1262,1272,1275,1279-1281,1283-1285,1290,1295,1299,1308-1309,1312,1314,1317-1318,1320,1331-1334,1336-1337,1340,1349,1354,1357,1359-1360,1362,1364,1369,1373-1379,1383,1388-1392,1396,1398-1399,1402-1403,1405-1412,1414-1427,1430-1432,1434-1435,1437,1440-1443,1446-1447,1450,1452-1457,1461-1463,1466-1469,1471-1473,1475-1476,1478-1479,1481-1483,1486-1487,1489-1494,1496-1497,1499,1502,1504-1507,1509,1511,1513-1519,1522,1524-1527,1530,1532-1538,1540,1543-1544,1546-1549,1551-1557,1559,1562,1564,1566-1567,1570-1571,1573-1575,1577,1579-1586,1588-1589,1593,1595-1596,1602-1603,1606,1608-1610,1614-1615,1617-1618,1620,1622,1624,1627,1629-1631,1633,1700-1701,1703,1705,1708]
batch* up 14-00:00:0 153 alloc cn[1204,1207-1208,1210,1215-1216,1219,1222-1227,1230,1234-1235,1237-1241,1257,1260-1261,1264-1270,1273-1274,1276-1278,1282,1286,1288-1289,1291-1294,1296-1298,1303-1307,1310-1311,1313,1315-1316,1321,1327,1335,1338-1339,1341-1348,1350-1353,1355,1358,1361,1363,1365-1368,1370-1371,1380-1382,1384-1385,1387,1393-1395,1397,1400-1401,1404,1436,1438-1439,1445,1448-1449,1458,1460,1464,1470,1474,1477,1480,1485,1488,1495,1498,1500-1501,1503,1508,1510,1512,1520,1523,1528-1529,1531,1545,1550,1558,1561,1563,1565,1568-1569,1576,1587,1592,1594,1597,1604-1605,1607,1612-1613,1616,1619,1621,1623,1625,1628,1632,1702,1704,2000]
batch_gpu up 14-00:00:0 4 inval gpu[0038,0040,0046,0060]
batch_gpu up 14-00:00:0 11 drain* gpu[0024,0035-0036,0041,0044-0045,0048-0049,0053,0057,0084]
batch_gpu up 14-00:00:0 2 drng gpu[0063,0067]
batch_gpu up 14-00:00:0 6 drain gpu[0037,0056,0061,0064,0066,0068]
batch_gpu up 14-00:00:0 2 resv gpu[0062,0065]
batch_gpu up 14-00:00:0 33 mix gpu[0013,0026-0027,0033-0034,0039,0042,0050,0059,0069-0082,0085,0300-0306],gracehopper[01-02]
batch_gpu up 14-00:00:0 14 idle gpu[0015,0017-0023,0307-0310],hgx[01-02]
interactive_gpu up 14-00:00:0 1 drain dgx04
interactive_gpu up 14-00:00:0 2 mix dgx[01,03]
interactive_gpu up 14-00:00:0 3 idle dgx02,gpu[0058,0207]
sam up 2-02:00:00 1 alloc cms-sam-01
sam up 2-02:00:00 1 idle cms-sam-02