| Age | Commit message (Collapse) | Author |
|
cam::mpr:complete(union ccb *, struct mpr_command *, u_int, u32);
Where u_int is scsas->flags and u32 is the device_info.
This can't be done as an fbt probe because the data needed for it isn't
present at a function boundary.
Sponsored by: Netflix
|
|
These are left over from the $FreeBSD$ stuff.
Sponsored by: Netflix
|
|
These were a doodle that escaped into my staging tree. Remove them.
Sponsored by: Netflix
|
|
When looking for the boot_params symbol we need to get the UEFI memory
map, use host: prefix. The short-circuit we have for this only works
when we have a filesystem. During the earliest parts of boot, we can
sometimes not have this yet, so making this explicit allows these
environments to function.
It's always in the host path. Print better
error messages, and add newlines in two places.
Sponsored by: Netflix
|
|
This patch adds a power_condition parameter to the
scsi_start_stop() function and sets the power condition via SSU.
Reviewed by: imp (mentor)
Sponsored by: Samsung Electronic
Differential Revision: https://reviews.freebsd.org/D53922
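As a rough illustration of what setting the power condition via SSU involves, the sketch below fills in a 6-byte START STOP UNIT CDB using the SBC layout (opcode 0x1B, POWER CONDITION in the upper nibble of byte 4). The helper name build_ssu_cdb() and its parameters are hypothetical; this is not the signature of scsi_start_stop().

```c
#include <stdint.h>
#include <string.h>

#define SSU_OPCODE	0x1B	/* START STOP UNIT */
#define SSU_IMMED	0x01	/* byte 1, bit 0 */
#define SSU_START	0x01	/* byte 4, bit 0 */
#define SSU_LOEJ	0x02	/* byte 4, bit 1 */

/*
 * Hypothetical helper: build a START STOP UNIT CDB.
 * power_condition is the 4-bit POWER CONDITION field
 * (0 = START_VALID, 1 = ACTIVE, 2 = IDLE, 3 = STANDBY).
 */
static void
build_ssu_cdb(uint8_t cdb[6], int start, int load_eject, int immediate,
    uint8_t power_condition)
{
	memset(cdb, 0, 6);
	cdb[0] = SSU_OPCODE;
	if (immediate)
		cdb[1] |= SSU_IMMED;
	/* POWER CONDITION occupies bits 7..4 of byte 4. */
	cdb[4] = (uint8_t)((power_condition & 0x0F) << 4);
	if (start)
		cdb[4] |= SSU_START;
	if (load_eject)
		cdb[4] |= SSU_LOEJ;
}
```

With a non-zero power condition the device is expected to ignore the START and LOEJ bits, so a caller would normally leave them clear in that case.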
|
|
This patch adds a /boot/loader.conf setting that makes it possible to
override the detected number of slots in storage enclosures. Some (yes
I'm looking at you HPE D6020!) report fewer available slots than there
actually are (the D6020 seems to report 18 but actually has 35 per
drawer). This causes the mpr driver to have problems detecting/managing
all drives in a multienclosure setting. For the D6020 this occurs when
connecting two or more fully equipped (140 drives) enclosures to one
controller...
This problem can be "fixed" by adding the following to /boot/loader.conf
and rebooting:
hw.mpr.encl_min_slots="35"
Note: I (Warner) don't have this hardware to see if there's some way to
fix the detection, so I'm committing this as a stop-gap. It's a no-op if
no tunable is set.
PR: 271238
Reviewed by: imp
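A minimal sketch of the stop-gap described above, assuming the tunable acts as a lower bound on the slot count the enclosure reports and that 0 means "not set". The function name effective_slots() is hypothetical; how the driver actually applies hw.mpr.encl_min_slots may differ.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick the slot count to use for an enclosure.
 * reported  - slot count the enclosure firmware claims
 * min_slots - value of the tunable, 0 when unset (assumed convention)
 */
static uint16_t
effective_slots(uint16_t reported, uint16_t min_slots)
{
	/* No tunable set: behave exactly as before (no-op). */
	if (min_slots == 0)
		return (reported);
	/* Only ever raise the count, never lower it. */
	return (reported < min_slots ? min_slots : reported);
}
```

For the D6020 case above, effective_slots(18, 35) yields 35, while an unset tunable leaves the reported 18 untouched.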
|
|
Switch to using sys/stdarg.h for va_list type and va_* builtins.
Make an attempt to insert the include in a sensible place. Where
style(9) was followed this is easy, where it was ignored, aim for the
first block of sys/*.h headers and don't get too fussy or try to fix
other style bugs.
Reviewed by: imp
Exp-run by: antoine (PR 286274)
Pull Request: https://github.com/freebsd/freebsd-src/pull/1595
|
|
Sometimes, especially with older firmware, mps(4) would have trouble
initializing the card in one of these two steps. Add in a retry after a
short delay. Sean Bruno and Stephen McConnell thought this was OK in the
bug discussions, but never committed it. Steve indicated the delay
might not be necessary, but the OP clearly needed to make it longer to
make things work. I've kept the delay, and added the suggested comment.
Ported the iocfacts part to mpr as well, since we see similar errors
about once every month or two over a few thousand controllers at
work. We've not seen it with IOC_INIT as far back as I can query the
error log database, so I didn't port that forward. We'll see if this
helps, but won't know for sure until next year (so I'm committing it now
since it won't hurt and might help). We usually see this failure in
connection with complicated recovery operations with a drive that's
failing, though, at least in the last year's worth of failures. It's
not clear this is the same as OP or not.
PR: 212841
Sponsored by: Netflix
Co-authored-by: imp
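The retry-after-delay idea can be sketched as below. This is a userland model under stated assumptions: retry_step(), step_fn, delay_fn and flaky_step() are hypothetical names, and the real driver code issues firmware requests and uses a kernel delay rather than callbacks.

```c
#include <stddef.h>

typedef int (*step_fn)(void *arg);	/* returns 0 on success */
typedef void (*delay_fn)(unsigned ms);	/* pause between attempts */

/*
 * Run one initialization step up to 'attempts' times, pausing
 * between failed attempts. Returns the last error seen.
 */
static int
retry_step(step_fn step, void *arg, int attempts, delay_fn delay,
    unsigned delay_ms)
{
	int error = -1;

	for (int i = 0; i < attempts; i++) {
		error = step(arg);
		if (error == 0)
			break;
		/* Give the card a moment before trying again. */
		if (i + 1 < attempts && delay != NULL)
			delay(delay_ms);
	}
	return (error);
}

/* Stand-in for a step that fails a fixed number of times first. */
static int
flaky_step(void *arg)
{
	int *failures_left = arg;

	if (*failures_left > 0) {
		(*failures_left)--;
		return (-1);
	}
	return (0);
}
```

With two attempts, a step that fails once succeeds on the retry, while a step that keeps failing still returns an error to the caller.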
|
|
Reviewed by: oshogbo, imp
Sponsored by: Klara, Inc.
Sponsored by: Datazap
Differential Revision: https://reviews.freebsd.org/D43505
|
|
In preparation for adding a __result_use_check annotation to copyin()
and related functions, start checking for errors from copyout() in
the mpr(4) user command handler. This should make it easier to catch
bugs.
Reviewed by: imp, asomers
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D43177
|
|
Most all of the memory used by the cards in the mpr(4) and mps(4)
drivers is required, according to the specs and Broadcom developers,
to be within a 4GB segment of memory.
This includes:
System Request Message Frames pool
Reply Free Queues pool
ReplyDescriptorPost Queues pool
Chain Segments pool
Sense Buffers pool
SystemReply message pool
We got a bug report from Dwight Engen, who ran into data corruption
in the BAE port of FreeBSD:
> We have a port of the FreeBSD mpr driver to our kernel and recently
> I found an issue under heavy load where a DMA may go to the wrong
> address. The test system is a Supermicro X10SRH-CLN4F with the
> onboard SAS3008 controller setup with 2 enterprise Micron SSDs in
> RAID 0 (striped). I have debugged the issue and narrowed down that
> the errant DMA is one that has a segment that crosses a 4GB
> physical boundary. There are more details I can provide if you'd
> like, but with the attached patch in place I can no longer
> re-create the issue.
> I'm not sure if this is a known limit of the card (have not found a
> datasheet/programming docs for the chip) or our system is just
> doing something a bit different. Any helpful info or insight would
> be welcome.
> Anyway, just thought this might be helpful info if you want to
> apply a similar fix to FreeBSD. You can ignore/discard the commit
> message as it is my internal commit (blkio is our own tool we use
> to write/read every block of a device with CRC verification which
> is how I found the problem).
The commit message was:
> [PATCH 8/9] mpr: fix memory corrupting DMA when sg segment crosses
> 4GB boundary
> Test case was two SSD's in RAID 0 (stripe). The logical disk was
> then partitioned into two partitions. One partition had lots of
> filesystem I/O and the other was initially filled using blkio with
> CRCable data and then read back with blkio CRC verify in a loop.
> Eventually blkio would report a bad CRC block because the physical
> page being read-ahead into didn't contain the right data. If the
> physical address in the arq/segs was for example 0x500003000 the
> data would actually be DMAed to 0x400003000.
The original patch was against mpr(4) before busdma templates were
introduced, and only affected the buffer pool (sc->buffer_dmat) in
the mpr(4) driver. After some discussion with Dwight and the
LSI/Broadcom developers and looking through the driver, it looks
like most of the queues in the driver are ok, because they limit
the memory used to memory below 4GB. The buffer queue and the chain
frames seem to be the exceptions.
This is pretty much the same between the mpr(4) and mps(4) drivers.
So, apply a 4GB boundary limitation for the buffer and chain frame pools
in the mpr(4) and mps(4) drivers.
Reported by: Dwight Engen <dwight.engen@gmail.com>
Reviewed by: imp
Obtained from: Dwight Engen <dwight.engen@gmail.com>
Differential Revision: <https://reviews.freebsd.org/D43008>
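To make the constraint concrete, here is a small sketch of the check "does this DMA segment cross a 4GB physical boundary". The function name crosses_4gb() is hypothetical and the code is a userland illustration, not driver code; in the driver the restriction is expressed to busdma rather than checked by hand.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOUNDARY_4GB	(UINT64_C(1) << 32)

/*
 * Return true if the segment [paddr, paddr + len) spans two
 * different 4GB windows of physical address space.
 */
static bool
crosses_4gb(uint64_t paddr, uint64_t len)
{
	if (len == 0)
		return (false);
	/* Compare the 4GB window of the first and last byte. */
	return ((paddr / BOUNDARY_4GB) !=
	    ((paddr + len - 1) / BOUNDARY_4GB));
}
```

A segment starting at 0x4FFFFF000 with length 0x4000 ends at 0x500002FFF and therefore crosses; a 4KB segment at 0x500003000 does not.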
|
|
xpt_path_string() is now a wrapper around xpt_path_sbuf(). Using it
to then concatenate the result to another sbuf makes no sense. Just call
xpt_path_sbuf() directly.
MFC after: 1 month
|
|
Depending on the card's firmware version, it may return different length
responses for MPI2_FUNCTION_IOC_FACTS. But the first part of the
response contains the length of the rest, so query it first to get the
length and then use that to size the buffer for the full response.
Also, correctly zero-initialize MPI2_IOC_FACTS_REQUEST. It only worked
by luck before.
PR: 264848
Reported by: Julien Cigar <julien@perdition.city>
MFC after: 1 week
Sponsored by: Axcient
Reviewed by: scottl, imp
Differential Revision: https://reviews.freebsd.org/D38739
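The "query the length first, then size the buffer" approach can be sketched as follows. Everything here is an assumption made for illustration: struct reply_header, its msg_length_words field (taken to be the total reply length in 32-bit words), fetch_fn, fake_fetch() and fetch_full_reply() are hypothetical and do not reproduce the MPI2 structures or the driver's request path.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size start of a reply. */
struct reply_header {
	uint16_t reserved;
	uint8_t  msg_length_words;	/* total length, 32-bit words (assumed) */
	uint8_t  function;
};

typedef int (*fetch_fn)(void *buf, size_t len);	/* 0 on success */

/* Canned 16-byte reply standing in for the firmware. */
static const uint8_t canned_reply[16] = { 0, 0, 4, 0x03 };

static int
fake_fetch(void *buf, size_t len)
{
	size_t n = len < sizeof(canned_reply) ? len : sizeof(canned_reply);

	memset(buf, 0, len);
	memcpy(buf, canned_reply, n);
	return (0);
}

/* Two passes: read the header, then the full reply it describes. */
static void *
fetch_full_reply(fetch_fn fetch, size_t *out_len)
{
	struct reply_header hdr;
	size_t len;
	void *buf;

	memset(&hdr, 0, sizeof(hdr));	/* explicit zero-initialization */
	if (fetch(&hdr, sizeof(hdr)) != 0)
		return (NULL);
	len = (size_t)hdr.msg_length_words * 4;
	if (len < sizeof(hdr))
		return (NULL);
	buf = calloc(1, len);
	if (buf == NULL)
		return (NULL);
	if (fetch(buf, len) != 0) {
		free(buf);
		return (NULL);
	}
	*out_len = len;
	return (buf);
}
```

The caller owns the returned buffer and frees it; a firmware that reports a longer reply simply yields a larger allocation.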
|
|
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
|
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
|
moving -> removing (we're removing the device)
CAM_REQ_CMO_ERROR -> CAM_REQ_ERR (the former isn't a thing)
Sponsored by: Netflix
|
|
Pointed out by: imp
Sponsored by: Klara Inc.
|
|
Before the commit 6cc44223cb6717795afdac4348bbe7e2a968a07d the
field event_mask was fully copied to the EventMasks field.
After this commit the event_mask (uint8_t) is cast 4 times to
EventMask (uint32_t). Because of that, 24 bits of each 32-bit word of
the event_mask array are lost.
This commit brings back simple copying of the field, and afterwards
converts the 32-bit field to the requested endianness.
I don't think we need a more sophisticated method,
as the array is of size 4 (for the 32-bit version).
Reviewed by: imp
MFC after: 1 week
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D39562
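The difference between the two approaches can be shown with a short sketch. The sizes (16 mask bytes, 4 words) and the function names are assumptions for illustration, and the endian conversion step that follows the copy in the commit is left out here.

```c
#include <stdint.h>
#include <string.h>

#define MASK_WORDS	4	/* assumed number of 32-bit mask words */

/* Problematic form: widens one byte into each 32-bit word. */
static void
copy_mask_widening(uint32_t dst[MASK_WORDS],
    const uint8_t src[MASK_WORDS * 4])
{
	for (int i = 0; i < MASK_WORDS; i++)
		dst[i] = (uint32_t)src[i];	/* upper 24 bits end up 0 */
}

/* Simple form: copy every byte of the mask. */
static void
copy_mask_bytes(uint32_t dst[MASK_WORDS],
    const uint8_t src[MASK_WORDS * 4])
{
	memcpy(dst, src, MASK_WORDS * sizeof(uint32_t));
}
```

With every mask byte set to 0xFF, the widening copy yields 0x000000FF per word while the byte copy yields 0xFFFFFFFF.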
|
|
In every mpr and mps ioctl that copies kernel data to userland, validate
that the requested length does not exceed the size of the kernel's
buffer.
Note that all of these ioctls already required root access.
MFC after: 2 weeks
Sponsored by: Axcient
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D38842
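The validation amounts to a bounds check before the copy. The sketch below is a userland stand-in: copy_to_caller() is a hypothetical name and memcpy() takes the place of the kernel's copyout().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'requested' bytes of a kernel-side buffer to the caller,
 * refusing requests larger than the buffer actually is.
 */
static int
copy_to_caller(void *dst, size_t requested, const void *kbuf,
    size_t kbuf_size)
{
	/* Never read past the end of the kernel's buffer. */
	if (requested > kbuf_size)
		return (EINVAL);
	memcpy(dst, kbuf, requested);
	return (0);
}
```

A request for exactly the buffer size succeeds; one byte more is rejected instead of leaking adjacent memory.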
|
|
Issue Description:
The RequestCredits field of IOCFacts got changed between the Phase23
firmware to Phase24 firmware. So as part of firmware update operation,
driver has to free the resources & pools which are created with the Phase23
Firmware's IOCFacts data (i.e. during driver load time) and has to
reallocate the resources and pools using Phase24's IOCFacts data. Here
driver has freed the interrupts but missed to reallocate the interrupts and
hence config page read operation is getting timed out and controller is
going for recursive reinit (controller reset) operations and leading to
kernel panic.
Fix:
Reallocate the interrupts if the interrupts are disabled as part of
firmware update/downgrade operation.
Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Tested by: ken
MFC after: 3 days
|
|
- s/the the/the/
MFC after: 3 days
|
|
|
|
It's possible for multiple drives to be departing at the same time (if
the common power rail they share goes dark, for example). To understand
what's going on better, include target and handle in the messages
announcing the reset to allow matching with other corresponding events.
MFC After: 3 days
Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D35092
|
|
When we can't load a request due to a shortage of chains, we complete
the command's cm. However, to avoid an assert in mp?_complete_command,
we transition its state to INQUEUE. This transition is legitimate
because this is the only error path that terminates a cm before it's
enqueued and the only other alternative would be an additional transient
state that would add complexity w/o adding value. Add a comment
explaining all this because otherwise the transition can look a bit
weird.
Sponsored by: Netflix
|
|
Diff reduction between mpr and mps.
Fixes: e2997a03b7f7 ("Diagnostic buffer fixes for the mps(4)...")
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
|
|
*_CFG_PAGE ioctl handlers in the mpr, mps, and mpt drivers allocated a
buffer of a caller-specified size, but copied to it a fixed size header.
Add checks that the size is at least the required minimum.
Note that the device nodes are owned by root:operator with 0640
permissions so the ioctls are not available to unprivileged users.
This change includes suggestions from scottl, markj and mav.
Two of the mpt cases were reported by Lucas Leong (@_wmliang_) of
Trend Micro Zero Day Initiative; scottl reported the third case in mpt.
Same issue found in mpr and mps after discussion with imp.
Reported by: Lucas Leong (@_wmliang_), Trend Micro Zero Day Initiative
Reviewed by: imp, mav
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34692
|
|
MFC after: 1 week
|
|
There's a small race in freezing the simq when performing a diagnostic
reset. During this time, a transaction can slip through and encounter
the target id of 0. If we're still in diagnostic reset when we detect
this, we return a CAM_DEVICE_NOT_THERE status. Instead, freeze the queue
and return a requeue status, similar to what we do when we're resetting
a target and a transaction gets here. The race is unavoidable due to
separate locks for queue and SIM, but easy enough to detect and make
harmless.
Sponsored by: Netflix
Reviewed by: scottl, mav
Differential Revision: https://reviews.freebsd.org/D34017
|
|
The discovery callout is initialized and cancelled only, making it
write-only. Remove a state flag associated with it being pending as well
as two defines that aren't used that are associated with it. Remove
MP?SAS_SHUTDOWN flag, which is unused.
Sponsored by: Netflix
Reviewed by: ken, scottl, mav
Differential Revision: https://reviews.freebsd.org/D33925
|
|
It does not matter how often we check the firmware for crashes.
MFC after: 2 weeks
|
|
- s/segement/segment/
MFC after: 3 days
|
|
bug in error handling.
|
|
So, if we're processing a timeout, and we've sent an ABORT to the
firmware for that timeout, but not yet received the response from the
firmware, AND we get another timeout, we queue the timeout and freeze
the queue. However, when we've finally processed them all, we only
release the queue once. This causes all I/O to halt as the devq remains
frozen forever.
Instead, only freeze the queue when we start the process (eg set INRESET
on the target). This will allow the release when all the timed out I/Os
have finished ABORTing.
Sponsored by: Netflix
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D33054
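Because queue freezes are counted, the imbalance described above can be modeled with a toy counter. The struct and function names below are hypothetical and only mimic the bookkeeping; they are not the CAM or driver interfaces.

```c
/* Toy model of a target with a counted devq freeze. */
struct toy_target {
	int inreset;	/* recovery in progress */
	int frozen;	/* outstanding freeze count */
};

/* Old behavior: every timeout freezes the queue again. */
static void
timeout_old(struct toy_target *t)
{
	t->inreset = 1;
	t->frozen++;
}

/* New behavior: freeze only when recovery starts. */
static void
timeout_new(struct toy_target *t)
{
	if (!t->inreset) {
		t->inreset = 1;
		t->frozen++;
	}
}

/* All timed out I/Os have finished aborting: release once. */
static void
recovery_done(struct toy_target *t)
{
	t->inreset = 0;
	if (t->frozen > 0)
		t->frozen--;
}
```

Two timeouts followed by one release leave the old model with a freeze count of 1 (I/O halted forever) and the new model with 0.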
|
|
Minor reformatting nits to make mprsas_scsiio_timeout match
mpssas_scsiio_timeout more closely. The differences aren't necessary and
are distracting when comparing the routines. No functional changes.
Sponsored by: Netflix
|
|
It fixes lock order reversal between SIM and device locks. Also
remove registration for AC_FOUND_DEVICE, unused for a while now.
MFC after: 1 month
|
|
Before this change devq was frozen only if some command was sent to
the target after reset started, but release was called always. This
change freezes the devq immediately, leaving the mprsas_action_scsiio()
check only to cover the race condition due to the different lock the devq uses.
This should also avoid unnecessary requeue of the commands, creating
additional log noise and confusing some broken apps.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
|
|
SAS9305-16e with firmware 16.00.01.00 reports a HighPriorityCredit of
only 8, while for comparison some other combinations I have report
100 or even 128. When a large JBOD detaches, the requirement to send a
target reset command to each target at the same time overflows the limit,
and without adequate handling leaves devices stuck in a half-detached
state, preventing later re-attach.
To handle that, in case of an allocation error mark the target with the new
MPRSAS_TARGET_TOREMOVE flag, and retry the removal attempt the next time
something else frees a high priority command. With this patch I can
successfully detach/attach a 102 disk JBOD from/to the SAS9305-16e.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
|
|
MFC after: 2 weeks
|
|
When a DMA chain can't be loaded, set the state to STATE_INQUEUE so that
the mp[rs]_complete_command can properly fail the command.
Sponsored by: Netflix
|
|
When the mpr(4) and mps(4) drivers probe a SATA device, they issue an
ATA Identify command (via mp{s,r}sas_get_sata_identify()) before the
target is fully set up in the driver. The drivers wait for completion of
the identify command, and have a 5 second timeout. If the timeout
fires, the command is marked with the SATA_ID_TIMEOUT flag so it can be
freed later.
That is where the use-after-free problem comes in. Once the ATA
Identify times out, the driver sends a target reset, and then frees any
identify commands that have timed out. But, once the target reset
completes, commands that were queued to the drive are returned to the
driver by the controller.
At that point, the driver (in mp{s,r}_intr_locked()) looks up the
command descriptor for that particular SMID, marks it CM_STATE_BUSY and
sends it on for completion handling.
The problem at this stage is that the command has already been freed,
and put on the free queue, so its state is CM_STATE_FREE. If INVARIANTS
are turned on, we get a panic as soon as this command is allocated,
because its state is no longer CM_STATE_FREE, but rather CM_STATE_BUSY.
So, the solution is to not free ATA Identify commands that get stuck
until they actually return from the controller. Hopefully this works
correctly on older firmware versions. If not, it could result in
commands hanging around indefinitely. But, the alternative is a
use-after-free panic or assertion (in the INVARIANTS case).
This also tightens up the state transitions between CM_STATE_FREE,
CM_STATE_BUSY and CM_STATE_INQUEUE, so that the state transitions happen
once, and we have assertions to make sure that commands are in the
correct state before transitioning to the next state. Also, for each
state assertion, we print out the current state of the command if it is
incorrect.
mp{s,r}.c: Add a new sysctl variable, dump_reqs_alltypes,
that controls the behavior of the dump_reqs sysctl.
If dump_reqs_alltypes is non-zero, it will dump
all commands, not just the commands that are in the
CM_STATE_INQUEUE state. (You can see the commands
that are in the queue by using mp{s,r}util debug
dumpreqs.)
Make sure that the INQUEUE -> BUSY state transition
happens in one place, the mp{s,r}_complete_command
routine.
mp{s,r}_sas.c: Make sure we print the current command type in
command state assertions.
mp{s,r}_sas_lsi.c:
Add a new completion handler,
mp{s,r}sas_ata_id_complete. This completion
handler will free data allocated for an ATA
Identify command and free the command structure.
In mp{s,r}_ata_id_timeout, do not set the command
state to CM_STATE_BUSY. The command is still in
queue in the controller. Since we were blocking
waiting for this command to complete, there was
no completion handler previously. Set the
completion handler, so that whenever the command
does come back, it will get freed properly.
Do not free ATA Identify commands that have timed
out in mp{s,r}sas_add_device(). Wait for them
to actually come back from the controller.
mp{s,r}var.h: Add a dump_reqs_alltypes variable for the new
dump_reqs_alltypes sysctl.
Make sure we print the current state for state
transition asserts.
This was tested in the Spectra Logic test bed (as described in the
review), as well as Netflix's Open Connect fleet (where panics dropped from
a dozen or two a month to zero).
Reviewed by: imp@ (who is handling the commit with ken's OK)
Sponsored by: Spectra Logic
Differential Revision: https://reviews.freebsd.org/D25476
|
|
Reviewed By: imp
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D30301
|
|
Allow new enclosure to replace previously existing one if there is
no completely unused table entry, same as it is done for devices.
If we can not process DPM due to corruption -- wipe it and restart
from scratch. Otherwise I don't see a way to recover persistence if
something goes wrong and there is no BIOS to recover it for us.
Together this solves a problem that appeared when 9300-8i firmware
update to 16.00.10.00 somehow switched its mapping mode from Device
Persistence to Enclosure/Slot without wiping the DPM table. It made
HBA completely unusable, since overflowed and conflicting mapping
table was unable to map any of enclosures and so devices.
Also while there make some enclosure mapping errors more informative.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
|
|
Reviewed by: imp, kib
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29205
|
|
This fixes mpr driver on big-endian devices.
Tested on powerpc64 and powerpc64le targets using a SAS9300-8i card
(LSISAS3008 pci vendor=0x1000 device=0x0097)
Submitted by: Andre Fernando da Silva <andre.silva@eldorado.org.br>
Reviewed by: luporl, alfredo, Sreekanth Reddy <sreekanth.reddy@broadcom.com> (by email)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25785
|
|
The device mapping table contains sc->max_devices entries, so only
indices in [0, sc->max_devices) are valid.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27964
|
|
Previously we copied in the request into a stack-allocated structure
that could be smaller than the request size. Furthermore, we checked
the request size only after doing the copyin.
Fix this by allocating a buffer to hold the request, then copying the
buffer's contents into a command descriptor. This is a bit heavy-handed
but I expect the overhead will not be noticeable. The approach of
copying the header in first is susceptible to TOCTOU problems.
Reviewed by: imp
Reported by: maxpl0it@protonmail.com
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27963
|
|
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allows several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except the place which initializes maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embedded structures,
get a private MAXPHYS-like constant; their conversion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, were either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
Notes:
svn path=/head/; revision=368124
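The reason for the extra b_pages[] entry can be seen from a little arithmetic: a buffer of maxphys bytes that does not start on a page boundary touches one more page than an aligned one. The sketch below assumes 4KB pages and a 1MB maxphys purely for illustration; pages_spanned() is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED	4096u	/* assumption for the example */

/* Number of pages touched by [addr, addr + len). */
static size_t
pages_spanned(uintptr_t addr, size_t len)
{
	size_t off = addr % PAGE_SIZE_ASSUMED;	/* offset into first page */

	return ((off + len + PAGE_SIZE_ASSUMED - 1) / PAGE_SIZE_ASSUMED);
}
```

An aligned 1MB buffer spans 256 pages, which matches atop() of that size; the same buffer starting one byte into a page spans 257, hence the +1.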
|
|
SAM-3 specification introduced concept of Task Priority, that was renamed
to Command Priority in SAM-4, and supported by all modern SCSI transports.
It provides 15 levels of relative priorities: 1 - highest, 15 - lowest and
0 - default. SAT specification for SATA devices translates priorities 1-3
into NCQ high priority.
This change adds new "priority" field into empty spots of struct ccb_scsiio
and struct ccb_accept_tio of CAM and struct ctl_scsiio of CTL. Respective
support is added into iscsi(4), isp(4), mpr(4), mps(4) and ocs_fc(4) drivers
for both initiator and where applicable target roles. Minimal support was
added to CTL to receive the priority value from different frontends, pass it
between HA controllers and report in few places.
This patch does not add consumers of this functionality, so nothing should
really change yet, since the field is still set to 0 (default) on initiator
and not actively used on target. Those are to be implemented separately.
I've confirmed priority working on WD Red SATA disks connected via mpr(4)
and properly transferred to CTL target via iscsi(4), isp(4) and ocs_fc(4).
While there, added missing tag_action support to ocs_fc(4) initiator role.
MFC after: 1 month
Relnotes: yes
Sponsored by: iXsystems, Inc.
Notes:
svn path=/head/; revision=367044
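The priority scheme described above can be summarized in a few lines. The macro and function names are hypothetical and are not taken from CAM or CTL; the value ranges come straight from the description (0 is default, 1 is highest, 15 is lowest, and 1-3 translate to NCQ high priority for SATA devices).

```c
#include <stdbool.h>
#include <stdint.h>

#define CMD_PRIO_DEFAULT	0	/* no explicit priority */
#define CMD_PRIO_HIGHEST	1
#define CMD_PRIO_LOWEST		15

/* The priority field is 4 bits wide: 0 plus 15 relative levels. */
static bool
cmd_prio_valid(uint8_t prio)
{
	return (prio <= CMD_PRIO_LOWEST);
}

/* Priorities 1-3 are translated to NCQ high priority. */
static bool
cmd_prio_is_ncq_high(uint8_t prio)
{
	return (prio >= CMD_PRIO_HIGHEST && prio <= 3);
}
```

The default value 0 is valid but does not select NCQ high priority, which matches the statement that nothing changes while the field is left at 0.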
|
|
functional change.
Notes:
svn path=/head/; revision=366668
|
|
that can be extended, but also ensure compile-time type checking. Refactor
common code out of arch-specific implementations. Move the mpr and mps
drivers to this new API. The template type remains visible to the consumer
so that it can be allocated on the stack, but should be considered opaque.
Notes:
svn path=/head/; revision=365706
|