How would you start the diagnostics if you see the following error:
RDBMS Alert log
ORA-27063: number of bytes read/written is incorrect
WARNING: IO Failed. subsys:System dg:1, diskname:/dev/roralv disk:0x0.0xdb804e82 au:408959 iop:0x1106e4b50 bufp:0x116844000 offset(bytes):428825575424 iosz:65536 operation:2(Write) synchronous:0 result: 4 osderr:0x0 osderr1:0x0 pid:1421568
Errors in file /oracle/diag/rdbms/bi1/bi1/trace/bi1_ora_1421568.trc:
ORA-15080: synchronous I/O operation to a disk failed
WARNING: failed to write mirror side 1 of virtual extent 48509 logical extent 0 of file 287 in group 1 on disk 0 allocation unit 408959
ORA-1114 : opiodr aborting process unknown ospid (1421568_1)
Errors in file /oracle/diag/rdbms/bi1/bi1/trace/bi1_ora_1421568.trc (incident=46471):
ORA-00600: internal error code, arguments: [kfmdSlvLeaveWrt1], , , , , , , , , , ,
ORA-01114: IO error writing block to file 207 (block # 3104636)
ORA-15081: failed to submit an I/O operation to a disk
ASM Alert log
ORA-27072: File I/O error
Additional information: 7
Additional information: 837548032
Additional information: 1044480
WARNING: IO Failed. subsys:System dg:1, diskname:/dev/roralv disk:0x0.0xdb804e82 au:408959 iop:0x110a51dd0 bufp:0x110615c00 offset(bytes):428824592384 iosz:1048576 operation:1(Read) synchronous:1 result: 4 osderr:0x0 osderr1:0x0 pid:2580822
You might look at:
a.) The trace files accompanying the error. They provide the session information, the failing SQL, the call stack and a lot of other information. A first attempt might be to re-run the failing SQL to see if the error persists. If it does, follow up with 'analyze table' on the underlying tables used in the query.
b.) Another option is to find out what file# 207 is: a datafile or a tempfile? The db_files parameter tells you: a file number reported in an error that is greater than db_files refers to a tempfile, and subtracting db_files gives the tempfile number. Would running dbv prove beneficial? If it is a tempfile, is it corrupted? Would dropping the tempfile help?
c.) Looking at the warning messages in the alert log, you might be tempted to think that it is a hardware problem. Because it is a write error, has someone changed the permissions on the disk (/dev/roralv)?
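The file-number check in (b) comes down to a little arithmetic, sketched here in Python. This is only an illustration: the function name is hypothetical, and the db_files value of 200 is an assumption taken from the "200 + 7" breakdown of file# 207 in this case. On a real system you would read db_files from the instance rather than hard-code it.

```python
from typing import Tuple


def classify_file_number(reported_file_no: int, db_files: int) -> Tuple[str, int]:
    """Classify a file number taken from an error such as ORA-01114.

    Assumes the convention that tempfiles are reported as
    db_files + tempfile number, while datafiles are reported as-is.
    """
    if reported_file_no < 1 or db_files < 1:
        raise ValueError("file number and db_files must be positive")
    if reported_file_no > db_files:
        # Above db_files: a tempfile, numbered relative to db_files.
        return ("tempfile", reported_file_no - db_files)
    return ("datafile", reported_file_no)


if __name__ == "__main__":
    # db_files = 200 is assumed here, matching the case described in this post.
    kind, number = classify_file_number(207, 200)
    print(kind, number)
```

With those inputs the function reports tempfile number 7, which points the investigation at temp space rather than at a datafile.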
While all of the above is helpful for checking the sanity of the system, none of it is the real cause of the problem.
The issue was due to insufficient free disk space in the diskgroup (using external redundancy) where the tempfile (file# 200 + 7, i.e. db_files plus tempfile number 7) was located. The tempfile was autoextensible with maxsize set to unlimited. Because the error was reported on a test system that is not monitored by the monitoring server, no warning alarms for diskgroup usage were raised.
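On a system without central monitoring, even a small scheduled check would have caught this earlier. The Python sketch below is one possible shape for such a check, not what was actually run here. The function name, the 10% threshold and the sample diskgroup names are all assumptions, and the rows are expected to come from a query against V$ASM_DISKGROUP executed with whatever database client you normally use.

```python
from typing import Iterable, List, Tuple

# Query to run against the ASM (or RDBMS) instance; NAME, TOTAL_MB and
# FREE_MB are columns of V$ASM_DISKGROUP.
DISKGROUP_QUERY = "SELECT name, total_mb, free_mb FROM v$asm_diskgroup"


def low_space_diskgroups(
    rows: Iterable[Tuple[str, float, float]],
    min_free_pct: float = 10.0,  # assumed threshold, tune to taste
) -> List[Tuple[str, float]]:
    """Return (diskgroup name, free percentage) for groups below the threshold."""
    flagged = []
    for name, total_mb, free_mb in rows:
        if total_mb <= 0:
            # Skip groups that report no capacity (e.g. not mounted).
            continue
        free_pct = 100.0 * free_mb / total_mb
        if free_pct < min_free_pct:
            flagged.append((name, round(free_pct, 1)))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample rows standing in for the query result.
    sample = [("DATA", 1000, 50), ("FRA", 1000, 500)]
    for name, pct in low_space_diskgroups(sample):
        print(f"diskgroup {name} has only {pct}% free")
```

With external redundancy, as in this case, FREE_MB is directly usable space. For mirrored diskgroups a check based on USABLE_FILE_MB would probably be more appropriate.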
The point here is: why not report a simple ORA-01652 in the RDBMS alert log or an ORA-15041 in the ASM alert log, rather than a bunch of scary, misleading messages?