
Commit 02f79ea

[InstSimplify] fold extracting from std::pair (1/2)
This patch intends to enable jump threading when a method whose return type is std::pair<int, bool> or std::pair<bool, int> is inlined. For example, jump threading currently does not happen for the if statement in func below:

    std::pair<int, bool> callee(int v) {
      int a = dummy(v);
      if (a) return std::make_pair(dummy(v), true);
      else return std::make_pair(v, v < 0);
    }

    int func(int v) {
      std::pair<int, bool> rc = callee(v);
      if (rc.second) {
        // do something
      }
    }

SROA, which runs before inlining, replaces the std::pair by an i64 without splitting it in both callee and func, because at that point no access to the individual fields is visible to SROA. After inlining, jump threading fails to recognize that the incoming value is a constant because of the additional instructions (or, and, trunc) that pack and unpack the pair.

This series of patches adds patterns to InstructionSimplify that fold the extraction of std::pair members. To help jump threading, we actually need to optimize a code sequence that spans multiple basic blocks. These patches do not handle phis themselves, but the additional patterns help the NewGVN pass, which calls InstSimplify to look for simplification opportunities across phis, apply its phi-of-ops optimization and thereby make the jump threading succeed.

SimplifyDemandedBits in InstCombine can do more general optimization, but this patch aims to provide opportunities for other optimizers by supporting a simple but common case in InstSimplify.

This first patch in the series handles code sequences that merge two values using shl and or and then extract one value using lshr.

Differential Revision: https://reviews.llvm.org/D48828

llvm-svn: 338485
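As an illustration (a minimal sketch, not taken from the commit or its tests; the function and value names are hypothetical), the caller-side IR for rc.second after SROA and inlining looks roughly like this, and the fold added in this patch collapses the final extraction:

    define i64 @extract_second_sketch(i32 %first, i1 %second) {
      ; Pack the two members into one i64: the int in the low 32 bits,
      ; the bool at bit 32 (the shl cannot overflow, hence nuw).
      %lo = zext i32 %first to i64
      %sec = zext i1 %second to i64
      %hi = shl nuw i64 %sec, 32
      %pair = or i64 %hi, %lo
      ; Extract the second member: ((X << 32) | Y) >> 32. Since %lo is
      ; known to use at most the low 32 bits, the new fold simplifies
      ; %rc.second to %sec.
      %rc.second = lshr i64 %pair, 32
      ret i64 %rc.second
    }

Running InstSimplify over such a function (e.g. opt -instsimplify -S) should reduce the lshr to plain %sec, which is what later lets NewGVN and jump threading see through the packed pair.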
Parent: 2089e4c

2 files changed (+19, -10 lines)

‎llvm/lib/Analysis/InstructionSimplify.cpp

Lines changed: 17 additions & 0 deletions
@@ -1325,6 +1325,23 @@ static Value *SimplifyLShrInst(Value *Op0, Value *Op1, bool isExact,
   if (match(Op0, m_NUWShl(m_Value(X), m_Specific(Op1))))
     return X;
 
+  // ((X << A) | Y) >> A -> X if effective width of Y is not larger than A.
+  // We can return X as we do in the above case since OR alters no bits in X.
+  // SimplifyDemandedBits in InstCombine can do more general optimization for
+  // bit manipulation. This pattern aims to provide opportunities for other
+  // optimizers by supporting a simple but common case in InstSimplify.
+  Value *Y;
+  const APInt *ShRAmt, *ShLAmt;
+  if (match(Op1, m_APInt(ShRAmt)) &&
+      match(Op0, m_c_Or(m_NUWShl(m_Value(X), m_APInt(ShLAmt)), m_Value(Y))) &&
+      *ShRAmt == *ShLAmt) {
+    const KnownBits YKnown = computeKnownBits(Y, Q.DL, 0, Q.AC, Q.CxtI, Q.DT);
+    const unsigned Width = Op0->getType()->getScalarSizeInBits();
+    const unsigned EffWidthY = Width - YKnown.countMinLeadingZeros();
+    if (EffWidthY <= ShRAmt->getZExtValue())
+      return X;
+  }
+
   return nullptr;
 }
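For contrast, a hypothetical case (not part of this commit's tests; names are made up) where the EffWidthY guard must reject the fold: if the low value occupies 33 bits, its top bit overlaps bit 32 of the packed value, so the lshr does not simply yield X:

    define i64 @no_fold_sketch(i33 %a, i32 %b) {
      %lo = zext i33 %a to i64    ; EffWidthY = 33 > shift amount of 32
      %x = zext i32 %b to i64
      %hi = shl nuw i64 %x, 32
      %pack = or i64 %hi, %lo
      ; Bit 32 of %lo can leak into bit 0 of the result, so this lshr
      ; must not be simplified to %x, and the new code leaves it alone.
      %res = lshr i64 %pack, 32
      ret i64 %res
    }

Only when every set bit of Y lies strictly below the shift amount is the or guaranteed to leave X's shifted bits intact, which is exactly what the countMinLeadingZeros check encodes.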

‎llvm/test/Transforms/InstSimplify/shift.ll

Lines changed: 2 additions & 10 deletions
@@ -178,11 +178,7 @@ define <2 x i8> @shl_by_sext_bool_vec(<2 x i1> %x, <2 x i8> %y) {
 define i64 @shl_or_shr(i32 %a, i32 %b) {
 ; CHECK-LABEL: @shl_or_shr(
 ; CHECK-NEXT: [[TMP1:%.*]] = zext i32 [[A:%.*]] to i64
-; CHECK-NEXT: [[TMP2:%.*]] = zext i32 [[B:%.*]] to i64
-; CHECK-NEXT: [[TMP3:%.*]] = shl nuw i64 [[TMP1]], 32
-; CHECK-NEXT: [[TMP4:%.*]] = or i64 [[TMP2]], [[TMP3]]
-; CHECK-NEXT: [[TMP5:%.*]] = lshr i64 [[TMP4]], 32
-; CHECK-NEXT: ret i64 [[TMP5]]
+; CHECK-NEXT: ret i64 [[TMP1]]
 ;
   %tmp1 = zext i32 %a to i64
   %tmp2 = zext i32 %b to i64
@@ -214,11 +210,7 @@ define i64 @shl_or_shr2(i32 %a, i32 %b) {
 define <2 x i64> @shl_or_shr1v(<2 x i32> %a, <2 x i32> %b) {
 ; CHECK-LABEL: @shl_or_shr1v(
 ; CHECK-NEXT: [[TMP1:%.*]] = zext <2 x i32> [[A:%.*]] to <2 x i64>
-; CHECK-NEXT: [[TMP2:%.*]] = zext <2 x i32> [[B:%.*]] to <2 x i64>
-; CHECK-NEXT: [[TMP3:%.*]] = shl nuw <2 x i64> [[TMP1]], <i64 32, i64 32>
-; CHECK-NEXT: [[TMP4:%.*]] = or <2 x i64> [[TMP3]], [[TMP2]]
-; CHECK-NEXT: [[TMP5:%.*]] = lshr <2 x i64> [[TMP4]], <i64 32, i64 32>
-; CHECK-NEXT: ret <2 x i64> [[TMP5]]
+; CHECK-NEXT: ret <2 x i64> [[TMP1]]
 ;
   %tmp1 = zext <2 x i32> %a to <2 x i64>
   %tmp2 = zext <2 x i32> %b to <2 x i64>
